CONFIDENCE  SETS  FOR  FUNCTIONS  OF  VARIANCE 
COMPONENTS  IN  A MIXED  LINEAR  MODEL 


BY 

ROBERT  M.  BASKIN 


A DISSERTATION  PRESENTED  TO  THE  GRADUATE  SCHOOL 
OF  THE  UNIVERSITY  OF  FLORIDA  IN  PARTIAL  FULFILLMENT 
OF  THE  REQUIREMENTS  FOR  THE  DEGREE  OF 
DOCTOR  OF  PHILOSOPHY 

UNIVERSITY  OF  FLORIDA 


1991 


ACKNOWLEDGEMENTS 


I would  like  to  express  my  sincere  gratitude  to  Professor  Malay 
Ghosh  for  being  my  advisor,  for  proposing  the  problem  of  this 
dissertation,  and  for  guiding  me  to  its  conclusion.  Words  cannot 
express  what  he  has  given  to  me.  I would  like  to  thank  Professors 
Ronald  Randles,  P.  V.  Rao,  Andre  Khuri  and  Joe  Glover  for  serving  on 
my  committee.  Their  efforts  are  truly  appreciated.  I would  like  to 
thank  Andrew  Rosalsky,  Myron  Chang,  Geoff  Vining  and  Mark  Yang  for 
their  special  attention  and  kindness. 

I am  indebted  to  Richard  Scheaffer,  Alan  Agresti , Malay  Ghosh, 
Ronald  Randles,  and  John  Saw  for  guiding  me  through  the  graduate 
program  at  the  University  of  Florida.  If  it  were  not  for  Ronald 
Randles  I would  not  be  here. 

I am  indebted  to  Mike  Conlon,  Art  Smith,  Phil  Padgett  and 
everyone  who  made  the  departmental  computer  system  operate.  If  it 
were  not  for  them  the  simulations  would  never  have  been  completed. 

My  special  thanks  go  to  Carol  Rozear,  Nancy  Pipkin,  Leslie 
Easom,  Marilyn  Saddler  and  Cindy  Zimmerman  for  taking  care  of  many  of 
the  details  and  technical  aspects  of  my  stay  here. 

Last,  but  not  least,  I would  like  to  thank  my  kind  and  loving 
wife  whose  patience  has  been  unbounded. 




TABLE OF CONTENTS

ACKNOWLEDGEMENTS

ABSTRACT

CHAPTERS

ONE  INTRODUCTION

1.1  Literature Review
1.2  The Subject of This Dissertation
1.3  Matrix Notations

TWO  GENERAL LINEAR MODEL WITH TWO VARIANCE COMPONENTS

2.1  Introduction
2.2  Henderson Estimates of Variance
2.3  Multivariate Central Limit Theorem
2.4  Jackknife Estimates

THREE  GENERAL LINEAR MODEL WITH SEVERAL VARIANCE COMPONENTS

3.1  Introduction
3.2  Multivariate Central Limit Theorem
3.3  Jackknife Estimates

FOUR  HIERARCHICAL BAYES ESTIMATION OF THE VARIANCE RATIO

4.1  Introduction
4.2  The Derivation of the Bayes Estimator
4.3  Jackknifed Estimator of the Asymptotic Variance

FIVE  RESULTS OF SIMULATIONS

5.1  Introduction
5.2  The Simulation Results for the Unbalanced Model
5.3  The Simulation Results for the Balanced Model

SIX  SUMMARY AND FUTURE RESEARCH

6.1  Summary
6.2  Future Research

BIBLIOGRAPHY

BIOGRAPHICAL SKETCH




Abstract  of  Dissertation  Presented  to  the  Graduate  School 
of  the  University  of  Florida  in  Partial  Fulfillment  of  the 
Requirements  for  the  Degree  of  Doctor  of  Philosophy 

CONFIDENCE  SETS  FOR  FUNCTIONS  OF  VARIANCE 
COMPONENTS  IN  A MIXED  LINEAR  MODEL 

By 

Robert  M.  Baskin 
August,  1991 

Chairman:  Dr.  Malay  Ghosh 

Major  Department:  Statistics 

In  this  dissertation  a distribution-free  approach  to  construction 
of  confidence  sets  is  employed  by  using  resampling  techniques  in  a 
general  linear  model  including  covariates.  Initially  only  models  with 
two  variance  components  are  explored  since  this  gives  rise  to  the 
common situation of estimating the "heritability ratio." Some of the
widely  used  models  in  small  area  estimation  including  the  nested  error 
regression  model,  random  regression  coefficients  model,  etc. 
considered  by  earlier  authors  are  seen  to  be  special  cases  of  this 
model.  The  asymptotic  properties  of  the  jackknifed  versions  of  the 
Henderson  III  estimators  as  well  as  the  jackknifed  estimates  of 
variance  are  studied.  This  is  extended  to  models  with  several  random 
components  but  still  including  covariates.  In  the  situation  of  a 
balanced  one-way  Anova  model  with  covariates  a hierarchical  Bayes 
estimator  of  the  variance  ratio  is  derived.  The  asymptotic 
distribution of this estimator is derived under a frequentist model.
The jackknife estimate of variance for this estimator is shown to be
consistent  and  used  to  construct  confidence  sets  for  the  variance 
ratio.  Finally  computer  simulations  of  some  of  these  models  are  used 
to  illustrate  the  features  of  the  work. 




CHAPTER  ONE 
INTRODUCTION 


1.1  Literature Review


While  the  employment  of  linear  models  can  be  traced  back  for 
centuries  in  specific  fields  such  as  astronomy,  where  they  have  been 
used  for  predicting  the  position  of  celestial  bodies,  the  latter  half 
of  this  century  has  seen  linear  models  rise  to  be  a standard  method  in 
many  other  fields,  especially  such  fields  as  genetic  selection, 
psychometrics,  and  survey  sampling.  Indeed,  breeders  of  both  plants 
and  animals  have  used  variance  components  in  linear  models  extensively 
for  predicting  characteristics  of  future  progeny  since  the  landmark 
works  of  Henderson  (1950,1953).  For  purposes  of  selecting  the  most 
desirable  trait  of  progeny  under  consideration,  breeders  produce  what 
is  referred  to  as  a selection  index  based  on  certain  linear 
combinations  of  fixed  and  random  components  under  a mixed  linear 
model.  Discussion  of  selection  indices  in  plant  breeding  may  be  found 
in  Henderson  (1963)  while  a discussion  of  these  indices  in  dairy 
cattle  breeding  may  be  found  in  Lush  and  Shrode  (1950)  or  Henderson, 
Kempthorne, Searle, and von Krosigk (1959). Other references in this
area  are  cited  in  Gianola  and  Fernando  (1986)  and  Harville  (1990). 

The fields of education and psychometrics have made varied use of
linear models for measurement purposes. Further references are given
in  Algina  and  Crocker  (1986). 

In  survey  sampling,  survey  analysts  have  used  linear  models  in 
finite  population  sampling  to  predict  characteristics  of  the  unsampled 
units  based  on  the  observed  sample.  One  of  the  first  recognized  uses 
of  linear  models  in  general,  and  variance  components  in  particular,  in 
the  area  of  survey  sampling  is  due  to  Cochran  (1939).  Because  of 
limited  resources  as  well  as  rapid  development  of  sophisticated 
statistical  techniques  by  such  pioneers  as  Cochran,  sample  surveys 
have  been  widely  used  in  this  country  for  over  half  a century.  These 
surveys  have  provided  reliable  statistics  on  the  national  and  state 
level  with  growing  regularity  and  with  the  advent  of  electronic 
telecommunications  equipment  have  increased  in  both  frequency  and 
application.  However,  in  the  past,  the  use  of  these  methods  in 
sublevels  below  the  state  level  has  been  limited  because  the  estimates 
for  these  sublevels  have  usually  been  based  on  small  sample  sizes,  at 
least  within  the  sublevels,  and  thus  these  estimates  produced 
unacceptably  large  standard  errors.  Therefore,  little  of  the  early 
work  of  survey  analysts  was  devoted  to  producing  reliable  small  area 
estimators.

In  recent  years,  however,  small  area  estimation  has  grown  into  an 
important  topic  in  survey  sampling.  Many  government  agencies  such  as 
the  United  States  Census  Bureau,  Statistics  Canada,  and  the  Central 
Bureau  of  Statistics  of  Norway  have  been  involved  in  estimating 
population counts, unemployment rates, per capita income, etc. for
sublevels below the state, province, or fylker level. In this light,
small area estimation techniques have been devised that "borrow
strength" from similar neighboring areas for estimation and prediction
purposes.  A good  review  of  small  area  estimators  may  be  found  in 
Ghosh  and  Rao  (1991).  This  review  shows  how  linear  model  techniques 
can  be  used  to  address  small  area  estimation  problems. 

In  determining  the  effectiveness  of  the  Henderson  method  of 
estimating the variance components in a linear mixed model,
statisticians have proposed certain minimal properties for a desirable
estimator. In an early paper by Yates and Zacopanay (1935) an example
is  given  showing  the  possibility  of  producing  a negative  estimate  of 
variance  by  a standard  Anova  estimate.  With  few  exceptions  (see  for 
example  Smith  and  Murray  (1984))  this  particular  facet  of  the  problem 
has  always  been  a drawback  from  the  point  of  view  of  most 
statisticians.  Typically  the  solution  to  this  problem  is  to  use  the 
positive  part  of  the  variance  estimator  as  in  Crump  (1951)  or  Snedecor 
and  Cochran  (1967).  As  shown  by  Pukelsheim  (1977,1981)  and  discussed 
further  by  Rao  and  Kleffe  (1988)  the  situations  in  which  linear 
combinations  of  the  effects  in  a linear  model  are  nonnegatively 
estimable  are  rare. 

In  a comparison  of  several  variance  estimators  Corbeil  and  Searle 
(1976)  point  out  that  maximum  likelihood  estimators  of  variance 
components  require  that  the  likelihood  function  be  maximized  over  the 
positive  space  of  the  variance  component  parameters.  In  an  extensive 
work Harville (1977) reviews much of the material on standard Anova
estimators as well as maximum likelihood estimators under a
multivariate  normality  assumption  and  gives  asymptotic  results  under 
these  assumptions. 

While criteria such as inadmissibility of the Henderson III
estimators are discussed in Olsen, Seely, and Birkes (1976), the
aforementioned  work  of  Corbeil  and  Searle  indicates  that  these 
estimators  do  not  perform  too  badly  in  comparison  to  others  in 
simulation  studies. 

As  noted  earlier  most  of  the  previous  work  has  been  done  under 
the  assumption  that  the  random  terms  are  distributed  as  multivariate 
normal.  A notable  exception  is  the  work  of  Westfall  (1986)  in  which, 
under  a very  restrictive  set  of  hypotheses,  the  asymptotic 
distribution  of  the  Henderson  estimates  is  shown  to  be  multivariate 
normal.  In  Rao  and  Kleffe  (1988)  a univariate  central  limit  theorem 
for  quadratic  forms  under  fairly  mild  conditions  can  be  found. 

The  Henderson  estimates  give  point  estimates  of  the  parameters  of 
interest  but  in  order  to  obtain  confidence  sets  for  these  parameters 
it  is  necessary  to  use  another  technique.  The  jackknife  is  a general 
technique  for  estimating  the  variance  of  an  estimator  as  well  as 
reducing  the  bias  of  the  estimator.  This  technique  was  introduced  by 
Quenouille  (1949)  and  has  been  expanded  upon  by  several  others.  A 
review  is  given  in  Miller  (1974)  and  a tract  by  Efron  (1982)  is  quite 
informative  on  the  subject.  This  method  has  been  used  by  Arvesen 
(1969)  and  again  Arvesen  and  Layard  (1975)  to  produce  tests  of 
hypotheses for variance components in models with two variance
components, but their models contained no covariates other than fixed
means.
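
The jackknife idea described above can be sketched in a few lines. The following is a minimal illustration of the generic delete-one jackknife for an independent sample, not the procedure developed in this dissertation; the function name `jackknife` and the sample values are assumptions made only for this example.

```python
def jackknife(data, estimator):
    """Delete-one jackknife: bias-corrected estimate and variance estimate.

    data is a list of observations; estimator maps a list to a number.
    """
    n = len(data)
    theta_full = estimator(data)
    # Pseudo-values: n * (full estimate) - (n - 1) * (leave-one-out estimate).
    pseudo = []
    for i in range(n):
        leave_one_out = data[:i] + data[i + 1:]
        pseudo.append(n * theta_full - (n - 1) * estimator(leave_one_out))
    theta_jack = sum(pseudo) / n
    # Sample variance of the pseudo-values divided by n.
    var_jack = sum((p - theta_jack) ** 2 for p in pseudo) / (n * (n - 1))
    return theta_jack, var_jack


def sample_mean(xs):
    return sum(xs) / len(xs)


theta, var = jackknife([1.0, 2.0, 4.0, 7.0], sample_mean)
print(theta, var)
```

For the sample mean the pseudo-values reduce to the observations themselves, so the jackknife returns the mean and the usual estimate of its variance. An approximate confidence interval would then take the form theta plus or minus a normal or t quantile times the square root of var.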

In  a different  vein,  Bayesian  models  have  also  been  used  to 
produce  estimates  of  variance  components  in  linear  model  setups. 
An empirical Bayes (EB) approach to small area estimation was given in
Fay  and  Herriot  (1979),  Ghosh  and  Meeden  (1986)  and  Ghosh  and  Lahiri 
(1987a,  1987b,  1988)  among  others.  The  technique  is  to  produce  a 
Bayes  estimate  of  the  unknown  parameter  of  interest,  either  by  using  a 
normal  prior  or  by  using  a linear  Bayes  argument  as  put  forth  in 
Hartigan  (1969).  The  unknown  parameters  of  the  prior  are  then 
estimated  by  a classical  method  of  estimation  such  as  method  of 
moments  or  maximum  likelihood.  The  resulting  estimate  is  called  an  EB 
estimator  of  the  parameter  of  interest. 

While  this  approach  may  yield  satisfactory  point  estimates  it  is 
extremely  difficult  to  use  this  technique  to  produce  confidence  sets 
due  to  the  lack  of  closed  form  expressions  for  the  mean  squared  errors 
(MSEs)  of  the  EB  estimators.  As  a result  several  authors  have 
attempted  to  produce  approximations  to  these  MSEs.  Kackar  and 
Harville  (1984),  Harville  (1985,1988),  and  Prasad  and  Rao  (1990)  have 
proposed  estimates  which  rely  on  the  normality  assumption  of  the  error 
terms.  Recently,  Lahiri  and  Rao  (1990)  were  able  to  relax  the 
normality  assumption  without  the  presence  of  covariates. 

A Bayes  approach  which  is  felt  by  some  to  be  an  improvement  over 
the  empirical  Bayes  approach  is  the  hierarchical  Bayes  (HB)  method 
advocated by Good (1965) and Lindley and Smith (1972). This approach
is reviewed in Ghosh (1989) for some balanced as well as unbalanced
models  with  covariates.  In  the  HB  method  the  unknown  parameters  are 
assumed  to  have  a distribution  of  known  form  but  possibly  depending  on 
unknown  parameters.  These  parameters  are  then  estimated  from  the 
data.  In  Datta  and  Ghosh  (1989)  the  HB  method  is  used  to  give 
estimates of linear combinations of fixed and random components. In
Datta  (1990)  a wide  variety  of  parameters  are  estimated  in  a linear 
model  context  using  an  HB  approach. 


1.2  The  Subject  of  This  Dissertation 

In this dissertation distribution-free confidence sets for
functions  of  variance  components  in  mixed  linear  models  are  presented. 
The  work  is  an  extension  of  the  basic  asymptotic  results  of  Westfall 
(1986)  and  of  the  earlier  work  of  Arvesen  (1969). 

In  Chapter  Two  linear  models  with  two  variance  components  are 
studied.  Many  previously  presented  models  fall  into  this  class 
including the nested error regression models (Battese et al., 1988;
Prasad and Rao, 1990; and Datta and Ghosh, 1989), the random
regression coefficients model (Dempster et al., 1981; Prasad and Rao,
1990;  Datta  and  Ghosh,  1989),  and  the  Fay-Herriot  model  (Fay  and 
Herriot,  1979;  Prasad  and  Rao,  1990;  and  Datta  and  Ghosh,  1989). 
Asymptotic  distributions  of  the  Henderson  III  estimators  of  functions 
of the variance components are derived. The jackknife estimates of
the variance components as well as their asymptotic distributions are
presented and are shown to be asymptotically equivalent to the
unjackknifed estimates. Finally the jackknife estimator of the
variance of these estimators is shown to be consistent. This
extends previous work in the area due to Arvesen (1969) and Arvesen
and Layard (1975) where no covariates were considered. In order to
achieve these results no distributional assumptions are required;
however, the distributions are assumed to have finite $4+\delta$ moments for
some $\delta > 0$.

In  Chapter  Three  the  results  of  Chapter  Two  are  extended  to 
include  the  possibility  of  multiple  variance  components.  This  has 
application  in  genetic  selection  where  linear  models  are  often 
employed  which  use  more  than  two  random  components  (Henderson,  1953, 
1963,  1975).  Again  asymptotic  distributions  of  the  Henderson  III 
estimators  of  functions  of  the  variance  components  are  derived.  The 
jackknife estimates of these variance components as well as their
asymptotic distributions are presented and are shown to be
asymptotically equivalent to the unjackknifed estimates. Again the
jackknife estimator of the variance of these estimators is shown
to  be  consistent. 

In  Chapter  Four  a hierarchical  Bayes  (HB)  model  is  presented  and 
for  the  balanced  situation  an  HB  estimator  of  the  ratio  of  the 
variance  components  is  derived.  This  estimator  extends  the  work  of 
Datta  and  Ghosh  (1989)  and  is  useful  in  giving  estimates  of  linear 
combinations of fixed and random components. The estimator is shown
to be asymptotically equivalent to the Henderson III estimators, and
in  fact  is  equal  to  the  usual  ratio  estimate  plus  an  integral  factor. 
Because  of  the  form  of  the  estimator  it  is  clearly  a nonnegative 
estimator  of  variance  and  thus  overcomes  the  drawback  of  the  usual 
estimator  while  retaining  the  good  asymptotic  properties  of  the 
Henderson  III  estimator.  A jackknifed  version  of  the  estimator  is 
presented  and  is  shown  to  be  asymptotically  equivalent  to  the 
unjackknifed version which gives the asymptotic distribution of the
jackknifed  estimator.  Furthermore  the  jackknife  estimate  of  variance 
is  shown  to  be  consistent. 

Chapter  Five  gives  computer  simulations  which  support  some  of  the 
main  points  of  the  previous  chapters.  Simulations  are  carried  out  in 
both  balanced  and  unbalanced  one-way  Anova  models  with  two  variance 
components. It is seen from the simulations that confidence sets based
on the Henderson estimates require a fairly large number of small areas
to be sampled: their coverage does not approach the nominal level until
the number of cells is close to forty. However, the Bayes estimators did
perform well in simulation, even for only eight or twelve cells.

This  work  provides  a technique  for  forming  confidence  sets  in 
models  which  are  often  encountered  in  practice  and  provides  an 
estimator  which  retains  many  of  the  useful  properties  of  traditional 
estimators  while  overcoming  typical  problems. 


1.3  Matrix Notations


In this work matrices will be denoted by capital letters and
underscored with tildes such as $A$ and $X$. The notation $I_u$ shall stand
for a $u \times u$ identity matrix, $1_u$ for a $u$-component column vector with
each element equal to 1, $J_{u,v}$ for the $u \times v$ matrix $1_u 1_v^T$, i.e., the $u \times v$
matrix with each entry equal to 1, and $J_u$ for the $u \times u$ matrix $J_{u,u}$.
Let $\mathrm{col}_{1 \le i \le k}(B_i)$ denote the matrix $(B_1^T, \ldots, B_k^T)^T$ and let
$\oplus_{i=1}^{k} A_i$ denote the block diagonal matrix

$$\oplus_{i=1}^{k} A_i = \begin{pmatrix} A_1 & & 0 \\ & \ddots & \\ 0 & & A_k \end{pmatrix}.$$

For the jackknife the following is useful notation. If
$X = \mathrm{col}_{1 \le i \le k}(X_i)$ denotes the block representation of a matrix, let
$X_{-i} = \mathrm{col}_{1 \le l \le k,\, l \ne i}(X_l)$ denote the matrix with the $i$th block removed and
let $X_{i-0}$ denote the matrix $X$ with the $i$th block replaced by a zero matrix.

For any matrix $X$ let $\mathrm{col}(X)$ denote the vector space spanned by
the columns of $X$ and $\mathrm{rank}(X)$ denote the dimension of this vector
space. For any matrix $X$ there exists a matrix $M$ which satisfies $XMX = X$.
This matrix will usually be denoted by $X^-$ and is called a
generalized inverse. Note that if $X$ is symmetric, i.e., $X^T = X$, then
$X^-$ can be taken to be symmetric since $\frac{1}{2}\{(X^-)^T + X^-\}$ is a generalized
inverse of $X$. A generalized inverse is called reflexive if it
satisfies $X^- X X^- = X^-$. As shown in Penrose (1955), for any matrix $X$
there is a unique reflexive generalized inverse for which,
additionally, both $X^- X$ and $X X^-$ are symmetric. This matrix is called
the Moore-Penrose inverse and will be denoted by $X^+$.

For a matrix $X$, let $P_X = X(X^T X)^- X^T$ denote the matrix which
projects onto $\mathrm{col}(X)$, the column space of $X$.
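
As a small check of this notation, consider the illustrative choice $X = 1_u$. This particular example is an assumption made here for illustration and is not taken from the text; it only exercises the definitions of the generalized inverse, the Moore-Penrose inverse and the projection matrix given above.

```latex
\begin{align*}
X &= 1_u, \qquad X^T X = 1_u^T 1_u = u, \qquad (X^T X)^- = u^{-1}, \\
P_X &= X (X^T X)^- X^T = u^{-1} 1_u 1_u^T = u^{-1} J_u, \\
X^+ &= u^{-1} 1_u^T, \qquad X X^+ X = 1_u \, (u^{-1} 1_u^T 1_u) = 1_u = X, \\
X^+ X X^+ &= (u^{-1} 1_u^T 1_u) \, u^{-1} 1_u^T = u^{-1} 1_u^T = X^+, \\
X^+ X &= 1, \qquad X X^+ = u^{-1} J_u \quad \text{(both symmetric)}.
\end{align*}
```

In this case $P_X Y$ replaces every entry of $Y$ by the average of its entries, which is the projection of $Y$ onto the span of $1_u$.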


CHAPTER  TWO 

GENERAL  LINEAR  MODEL  WITH  TWO  VARIANCE  COMPONENTS 
2.1  Introduction


In  this  chapter,  we  will  consider  the  problem  of  estimating 
functions  of  two  variance  components  under  a mixed  general  linear 
model.  Particular  attention  will  be  paid  to  the  ratio  of  the  variance 
components.  While  some  authors  have  been  interested  in  estimating  the 
variance  components  in  conjunction  with  fixed  elements  of  the  model, 
the  variance  components  themselves  can  be  of  intrinsic  interest  in 
some  fields.  In  Harville  (1976)  the  ratio  of  the  variance  components 
is related to heredity studies and is referred to as a "heritability
ratio". In Henderson (1975) variance components are used to model
genetic  selection  in  dairy  cattle.  In  the  field  of  education 
researchers  study  variance  components  under  the  title  of 
generalizability  theory.  A review  can  be  found  in  Shavelson  and  Webb 
(1981). 

The  estimation  of  components  of  variance  in  the  general  linear 
models  subsumes  the  specific  models  of  several  previous  authors  in  the 
problems  of  small  area  estimation  and  comparative  experiments.  In 
particular,  the  small  area  estimation  problem  known  as  nested  error 
regression, which was considered by several authors including Battese,
Harter, and Fuller (1988), Prasad and Rao (1990), and Datta and Ghosh
(1989),  as  well  as  the  model  known  as  the  random  regression 
coefficient  model  found  in  both  Dempster,  Rubin,  and  Tsutakawa  (1981) 
and  later  Prasad  and  Rao  (1990),  falls  within  the  above  framework  and 
will  be  used  as  an  illustrative  example. 

The  method  of  estimation  used  in  this  chapter,  known  as  the 
Henderson  III  estimates,  was  originally  proposed  in  the  landmark  paper 
of  Henderson  (1953)  in  order  to  estimate  variance  components  in  a 
setting  of  cattle  breeding.  As  with  many  estimators  of  variance 
components,  the  Henderson  III  estimates,  except  for  the  error 
variance,  have  the  possibility  of  producing  negative  estimates  of 
variance.  In  the  situation  of  small  area  estimation  with  not  many  of 
the  small  areas  sampled,  or  in  the  situation  of  a comparative 
experiment  with  only  a few  cells  a negative  estimate  of  variance  is 
not  unlikely.  Although  the  Henderson  procedure  runs  the  risk  of 
producing  negative  estimates  of  variance  components,  the  probability 
is  not  high  for  large  or  even  moderate  sized  samples.  The  present 
chapter  will  focus  on  the  asymptotic  study  of  Henderson  estimates 
whereas  an  attempt  to  rectify  the  problem  of  negative  estimates  of 
variance  components  will  be  made  in  Chapter  4 where  a Bayesian  method 
is  used  to  produce  nonnegative  estimates  of  variance  components. 

It  should  be  pointed  out  that  in  most  previous  work,  the  random 
components  have  been  assumed  to  be  normally  distributed.  Two  notable 
exceptions  are  the  works  of  Arvesen  (1969)  and  Westfall  (1986). 
Arvesen’s work, restricted to a one-way layout with fixed cell means,
used the technique of jackknifing based on U-statistics. The paper of
Westfall  under  very  restrictive  assumptions  produced  a Central  Limit 
Theorem  for  estimates  of  variances  in  linear  models  with  no  normality 
assumptions.  The  restrictions  in  Westfall  were  on  the  form  of  the 
model  and,  in  effect,  eliminated  all  possible  fixed  effects  except 
possibly  the  cell  means.  In  the  present  work,  this  restrictive 
assumption  is  eliminated  and  replaced  with  some  very  mild  restrictions 
on  the  form  of  the  design  matrix. 

The  present  chapter  will  follow  this  course.  Initially,  the  form 
of  the  model,  along  with  some  standard  examples,  will  be  presented. 
After  defining  the  model,  the  method  of  estimation  known  as 
Henderson’s  method  III  will  be  developed.  Next,  a multivariate  central 
limit  theorem  for  the  Henderson  estimates  is  proven.  The  jackknife 
estimator  is  derived  and  shown  to  be  asymptotically  equivalent  to  the 
Henderson  III  estimator.  Finally,  the  jackknife  estimate  of  variance 
is shown to converge to the asymptotic variance of these
estimators.

The mathematical form of the general linear model will now be
defined. In this setup it is assumed that we can observe $Y_{ij}$, a
characteristic of interest on the $j$th unit in the $i$th cell or small
area, where there are a maximum of $N$ units within each cell or small
area, we have sampled $k$ of the cells or small areas, and within the $i$th
small area we have sampled $n_i$ units. Let $n_.$ denote the total number
of observations. It is further assumed that the observable
characteristic $Y_{ij}$ can be modeled as a sum of fixed and random
components, to wit,

$$Y_{ij} = x_{0ij}^T \beta_0 + x_{1ij}^T \beta_{1i} + e_{ij} \quad \text{for } i = 1, \ldots, k \text{ and } j = 1, \ldots, n_i, \tag{2.1.1}$$

where $x_{0ij}$ is a $p_0 \times 1$ vector of known components, $x_{1ij}$ is a $p_1 \times 1$ vector
of known components, $\beta_0$ is a $p_0 \times 1$ vector of fixed but unknown
components, $\beta_{1i}$ is a $p_1 \times 1$ vector of random components and $e_{ij}$ is also
a random component. Furthermore the structure of the random
components is governed by the following:

a) The set $\{\beta_{1i}\}$ and the set $\{e_{ij}\}$ are mutually independent.

b) The set $\{e_{ij}\}$ is a set of mean 0 random components with
$\mathrm{Var}(\{e_{ij}\}) = \sigma_e^2 I_{n_.}$ and finite $4+\delta$ moments for some $\delta > 0$.

c) The set $\{\beta_{1i}\}$ is a set of mean 0 random vectors with
$\mathrm{Var}(\{\beta_{1i}\}) = \sigma_1^2 I_{kp_1}$ and finite $4+\delta$ moments for some $\delta > 0$.

In the course of the chapter, restrictions will be placed on $\{x_{0ij}\}$ and
$\{x_{1ij}\}$.

One example of the above model can be found in the nested error
regression model which has the form

$$Y_{ij} = x_{ij}^T \beta_0 + v_i + e_{ij} \quad \text{for } i = 1, \ldots, k \text{ and } j = 1, \ldots, n_i.$$

Another common model is the random regression coefficient model

$$Y_{ij} = x_{ij}\beta_0 + x_{ij}v_i + e_{ij} = x_{ij}(\beta_0 + v_i) + e_{ij} \quad \text{for } i = 1, \ldots, k \text{ and } j = 1, \ldots, n_i.$$
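
To make the structure of these models concrete, the sketch below generates data from a nested error regression model with a single scalar covariate. It is only an illustration: the function name `simulate_nested_error`, the uniform covariate values and the use of normal draws are assumptions made for this example, and the results of this dissertation do not require normality.

```python
import random


def simulate_nested_error(n_per_area, beta0, sigma_v, sigma_e, seed=0):
    """Simulate Y_ij = x_ij * beta0 + v_i + e_ij for each small area i.

    n_per_area lists the sample sizes n_1, ..., n_k.
    Returns one list of (x_ij, y_ij) pairs per area.
    """
    rng = random.Random(seed)
    data = []
    for n_i in n_per_area:
        v_i = rng.gauss(0.0, sigma_v)        # area-level random effect
        area = []
        for _ in range(n_i):
            x_ij = rng.uniform(0.0, 10.0)    # known covariate (made up here)
            e_ij = rng.gauss(0.0, sigma_e)   # unit-level error
            area.append((x_ij, x_ij * beta0 + v_i + e_ij))
        data.append(area)
    return data


sample = simulate_nested_error([3, 5, 2], beta0=2.0, sigma_v=1.0, sigma_e=0.5)
print([len(area) for area in sample])
```

Replacing `v_i` by `x_ij * v_i` in the response would give the random regression coefficient model instead.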




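For later purposes, we shall find it convenient to write

$$X_{0i} = \mathrm{col}_{1 \le j \le n_i}(x_{0ij}^T) \tag{2.1.2a}$$

and

$$X_{1i} = \mathrm{col}_{1 \le j \le n_i}(x_{1ij}^T). \tag{2.1.2b}$$

Then model (2.1.1) can be written in matrix form as

$$Y_i = X_{0i}\beta_0 + X_{1i}\beta_{1i} + e_i \tag{2.1.3a}$$

$$Y = X_0\beta_0 + X_1\beta_1 + e \tag{2.1.3b}$$

where $Y = \mathrm{col}_{1 \le i \le k}(Y_i)$ and $Y_i = \mathrm{col}_{1 \le j \le n_i}(Y_{ij})$ for $i = 1, \ldots, k$. In matrix
notation, the structure of the random components will be denoted by
$E(\beta_1) = 0$, $E(e) = 0$, $\mathrm{Cov}(\beta_1) = \sigma_1^2 I_{kp_1}$ and $\mathrm{Cov}(e) = \sigma_e^2 I_{n_.}$. Note that
(2.1.3b) is a special case of a general linear model

$$Y = \sum_{j=0} X_j\beta_j + e$$

which will be considered in Chapter 3.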

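A special subclass of model (2.1.1) is called the hierarchical
model and has the property $\mathrm{col}(X_0) \subset \mathrm{col}(X_1)$. The random regression
coefficient model is an example of a hierarchical model. If

$$Y_{ij} = x_{ij}(\beta_0 + v_i) + e_{ij} \quad \text{for } i = 1, \ldots, k \text{ and } j = 1, \ldots, n_i,$$

then

$$X_0 = \mathrm{col}_{1 \le i \le k}(X_{0i}) \quad \text{and} \quad X_1 = \oplus_{i=1}^{k} X_{1i}, \quad \text{where } X_{0i} = X_{1i} = \mathrm{col}_{1 \le j \le n_i}(x_{ij}) \text{ for } i = 1, \ldots, k,$$

so in this case $X_0 = X_1 1_k$ and hence $\mathrm{col}(X_0) \subset \mathrm{col}(X_1)$.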


2.2  Henderson Estimates of Variance


Under model (2.1.1) Henderson (1953) devised estimates of the
variance components, $\sigma_1^2$ and $\sigma_e^2$, which do not depend on the unknown
fixed components $\beta_0$. These estimates, now commonly referred to as
Henderson III estimates, but also known as Anova estimates, are linear
combinations of certain quadratic forms in the vector of observations
$Y$. They are commonly used because they are simple to
compute and are widely available in the standard computer packages.
They are unbiased. They are known to be
nonoptimal in some sense, in unbalanced models, as seen in Olsen,
Seely, and Birkes (1976), but in simulation studies have performed
fairly well in comparison to other estimates such as maximum
likelihood and MINQUE (see for example Corbeil and Searle, 1976).
However, as pointed out above, their biggest drawback is the
possibility of producing negative estimates of the variance
components.

Now let $U_2 = [X_0 | X_1]$ denote the full design matrix of $Y$ and let
$U_1 = [X_0]$ denote the reduced design matrix of $Y$. Further denote

$$SSE_1 = Y^T(I - P_{U_1})Y$$

$$SSE_2 = Y^T(I - P_{U_2})Y$$

which are often referred to as the residual sums of squares. It is also possible to
construct the estimators in terms of reductions in sums of squares
which are linear combinations of the above terms. Finally, let

$$r_0 = \mathrm{rank}(I - P_{U_1}),$$

$$r_1 = \mathrm{trace}\left[X_1^T(I - P_{U_1})X_1\right],$$

$$r_2 = \mathrm{rank}(I - P_{U_2}).$$

The Henderson III estimate of $\sigma_e^2$ is

$$\hat\sigma_e^2 = r_2^{-1}\,SSE_2$$

and the Henderson III estimate of $\sigma_1^2$ is

$$\hat\sigma_1^2 = r_1^{-1}\left[SSE_1 - r_0\,\hat\sigma_e^2\right].$$
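
The estimates defined above can be computed directly in the simplest special case. The sketch below assumes a one-way random effects model $Y_{ij} = \mu + v_i + e_{ij}$ with no covariates, i.e., $X_0 = 1_{n_.}$ and $X_1 = \oplus_{i=1}^{k} 1_{n_i}$; under that assumption $r_0 = n_. - 1$, $r_1 = n_. - \sum_i n_i^2 / n_.$ and $r_2 = n_. - k$. The function name `henderson3_one_way` and the data are hypothetical and serve only as an illustration.

```python
def henderson3_one_way(groups):
    """Henderson III estimates for Y_ij = mu + v_i + e_ij.

    groups[i] holds the observations of cell i (at least two cells,
    and more observations than cells).
    Returns (sigma1_sq_hat, sigma_e_sq_hat); the first may be negative.
    """
    k = len(groups)
    sizes = [len(g) for g in groups]
    n = sum(sizes)
    grand_mean = sum(sum(g) for g in groups) / n

    # SSE_1 = Y'(I - P_U1)Y with U1 = 1_n: deviations from the grand mean.
    sse1 = sum((y - grand_mean) ** 2 for g in groups for y in g)
    # SSE_2 = Y'(I - P_U2)Y with U2 = [X0 | X1]: deviations from cell means.
    sse2 = sum((y - sum(g) / len(g)) ** 2 for g in groups for y in g)

    r0 = n - 1                               # rank(I - P_U1)
    r1 = n - sum(m * m for m in sizes) / n   # trace[X1'(I - P_U1)X1]
    r2 = n - k                               # rank(I - P_U2)

    sigma_e_sq = sse2 / r2
    sigma1_sq = (sse1 - r0 * sigma_e_sq) / r1
    return sigma1_sq, sigma_e_sq


print(henderson3_one_way([[1.0, 3.0], [5.0, 7.0]]))
print(henderson3_one_way([[1.0, 5.0], [2.0, 4.0]]))
```

In the second data set the cell means coincide, so the estimate of $\sigma_1^2$ comes out negative; the positive-part remedy mentioned earlier would report `max(0.0, sigma1_sq)` instead.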


Since both quadratic forms are based on projection matrices which
project onto subspaces orthogonal to the column space of $X_0$, it is clear that

$$(I - P_{U_j})Y = (I - P_{U_j})(Y - EY), \quad j = 1, 2,$$

so that both quadratic forms are functionally independent of the fixed
effects. Furthermore since $I - P_{U_2}$ projects onto a subspace orthogonal to the
column space of $X_1$ then

$$(I - P_{U_2})Y = (I - P_{U_2})e$$

so that $SSE_2$ is functionally independent of all but the error term $e$.
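As an example consider the random regression coefficient model,

$$Y_{ij} = x_{ij}(\beta_0 + v_i) + e_{ij} \quad \text{for } i = 1, \ldots, k \text{ and } j = 1, \ldots, n_i.$$

Then

$$SSE_1 = \sum_{i=1}^{k}\sum_{j=1}^{n_i} Y_{ij}^2 - \Big(\sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij}Y_{ij}\Big)^2 \Big/ \sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij}^2,$$

$$SSE_2 = \sum_{i=1}^{k}\sum_{j=1}^{n_i} Y_{ij}^2 - \sum_{i=1}^{k}\Big[\Big(\sum_{j=1}^{n_i} x_{ij}Y_{ij}\Big)^2 \Big/ \sum_{j=1}^{n_i} x_{ij}^2\Big].$$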

The  unbiasedness  of  these  estimates,  a well  known  result,  is  an 
immediate  consequence  of  the  following  lemma  (see  for  example  Ghosh 
and  Lahiri,  1987a). 


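Lemma 2.2.1. Let $Y_1, \ldots, Y_k$ be independent random variables with
$E(Y_r) = 0$, $E(Y_r^2) = \mu_{2,r}$, and $E(Y_r^4) = \mu_{4,r}$ where $\mu_{4,r} < \infty$ for $r = 1, \ldots, k$.
Then for any symmetric $k \times k$ matrix $A = (a_{rr'})$,

a) $E(Y^T A Y) = \sum_{r=1}^{k} a_{rr}\mu_{2,r} = \mathrm{tr}(A\,\mathrm{diag}(\mu_{2,r}))$,

b) $\mathrm{Var}(Y^T A Y) = \sum_{r=1}^{k} a_{rr}^2(\mu_{4,r} - \mu_{2,r}^2) + 2\sum\sum_{1 \le r \ne r' \le k} a_{rr'}^2\,\mu_{2,r}\,\mu_{2,r'}$.

Proof of unbiasedness. Note that since $P_{U_2}$ is a projection matrix, so is $I - P_{U_2}$. Hence

$$r_2 = \mathrm{rank}[I - P_{U_2}] = \mathrm{tr}[I - P_{U_2}]$$

and by the previous lemma

$$E\hat\sigma_e^2 = \frac{1}{r_2}E\,Y^T[I - P_{U_2}]Y = \frac{1}{r_2}\sigma_e^2\,\mathrm{tr}[I - P_{U_2}] = \sigma_e^2.$$

Similarly, for $\hat\sigma_1^2$,

$$E\hat\sigma_1^2 = \frac{1}{r_1}E\,Y^T[I - P_{U_1}]Y - \frac{r_0}{r_1}E\hat\sigma_e^2
= \frac{1}{r_1}\left\{\sigma_1^2\,\mathrm{tr}[X_1^T(I - P_{U_1})X_1] + \sigma_e^2\,\mathrm{tr}[I - P_{U_1}]\right\} - \frac{r_0}{r_1}\sigma_e^2 = \sigma_1^2.$$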

2.3  Multivariate Central Limit Theorem


The main import of this section is to demonstrate that the vector
of estimates, $(\hat\sigma_1^2, \hat\sigma_e^2)^T$, satisfies a multivariate central limit theorem
in the sense that, after suitable centering and scaling, it
is asymptotically distributed as bivariate normal. The technique is
to use the projections of Hajek to get an approximating sequence of
independent vectors and then verify Liapounov’s condition.


In comparison to the central limit theorem for this model the
work of Westfall (1986) should be pointed out as providing the most
basic case, but his model is too restrictive in the sense that, in
general, it does not allow for any incorporation of covariates. In
particular, with reference to the design matrix which has been denoted
above by $U_2$, the assumptions of Westfall are that the matrix $U_2$
contains only values of 0 or 1 with exactly a single 1 in any row of $X_1$
and, similarly, exactly a single 1 in any row of $X_0$. Further, $U_2$ satisfies
the previously mentioned hierarchical property, meaning that
$\mathrm{col}(X_0) \subset \mathrm{col}(X_1)$. This situation is obviously much too restrictive to
actually incorporate any sensible analysis of covariance model. While
the Westfall model does handle more than two variance components, this
case will not be covered until Chapter Three.

Now since $U_2 = [X_0 | X_1]$, then $U_2^TU_2$ is a partitioned matrix, so in order to find $(U_2^TU_2)^-$ one can use a formula for the generalized inverse of a partitioned matrix such as found in Henderson and Searle (1981). Thus one obtains


(2.3.1) 


Let 


(2.3.2) 




where $\{X_{0i}\}$ and $\{X_{1i}\}$ are given in (2.1.2). It should be noted that $X_1 = \bigoplus_{i=1}^{k} X_{1i}$ so that $P_{X_1} = \bigoplus_{i=1}^{k} P_{X_{1i}}$. Using (2.3.2) it is possible to write (2.3.1) as


The subclass of models known as the hierarchical models, of which random regression coefficient models are an example, points out a salient feature. It would be convenient to assume $U_2^TU_2$ is an invertible matrix in order to simplify calculations, but there are many standard models, including all hierarchical models, for which this is not the case, even under typical assumptions such as $X_0^TX_0$ is invertible and $X_1^TX_1$ is invertible. However, for hierarchical models, i.e., where $\mathrm{col}(X_0) \subset \mathrm{col}(X_1)$, there is an extremely simple solution to (2.3.1), namely,


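A direct calculation shows that if $\mathrm{col}(X_0) \subset \mathrm{col}(X_1)$, then a symmetric generalized inverse of $U_2^TU_2$ is given by

\[ (U_2^TU_2)^- = \begin{pmatrix} 0 & 0 \\ 0 & (X_1^TX_1)^{-1} \end{pmatrix}. \]

This is because

\[ (U_2^TU_2)\begin{pmatrix} 0 & 0 \\ 0 & (X_1^TX_1)^{-1} \end{pmatrix}(U_2^TU_2)
= \begin{pmatrix} X_0^TX_0 & X_0^TX_1 \\ X_1^TX_0 & X_1^TX_1 \end{pmatrix}
\begin{pmatrix} 0 & 0 \\ 0 & (X_1^TX_1)^{-1} \end{pmatrix}
\begin{pmatrix} X_0^TX_0 & X_0^TX_1 \\ X_1^TX_0 & X_1^TX_1 \end{pmatrix} \]

\[ = \begin{pmatrix} 0 & X_0^TX_1(X_1^TX_1)^{-1} \\ 0 & I \end{pmatrix}
\begin{pmatrix} X_0^TX_0 & X_0^TX_1 \\ X_1^TX_0 & X_1^TX_1 \end{pmatrix}
= \begin{pmatrix} X_0^TP_{X_1}X_0 & X_0^TX_1 \\ X_1^TX_0 & X_1^TX_1 \end{pmatrix}
= (U_2^TU_2). \]

Again assuming that $\mathrm{col}(X_0) \subset \mathrm{col}(X_1)$,

\[ P_{U_2} = U_2(U_2^TU_2)^-U_2^T
= [X_0 | X_1]\begin{pmatrix} 0 & 0 \\ 0 & (X_1^TX_1)^{-1} \end{pmatrix}\begin{pmatrix} X_0^T \\ X_1^T \end{pmatrix}
= [\,0 \,|\, X_1(X_1^TX_1)^{-1}]\begin{pmatrix} X_0^T \\ X_1^T \end{pmatrix} \]

\[ = X_1(X_1^TX_1)^{-1}X_1^T = P_{X_1}. \]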


Observe that to reconcile this result with (2.3.1), if $\mathrm{col}(X_0) \subset \mathrm{col}(X_1)$, then $(I_{n_\cdot} - P_{X_1})X_0 = 0$, so

\[ (I_{n_\cdot} - P_{X_1})X_0C_{1\cdot}^-X_0^T(I_{n_\cdot} - P_{X_1}) \]

must be identically a zero matrix.
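The generalized-inverse argument for hierarchical models can be verified numerically. The sketch below assumes a small random regression coefficient design (one slope column $X_0$ and a block-diagonal $X_1$ built from the same regressor); the group sizes and regressor values are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(1)
sizes = [3, 4, 5]                       # illustrative group sizes
n, k = sum(sizes), len(sizes)
x = rng.uniform(1.0, 2.0, size=n)

X0 = x.reshape(-1, 1)                   # n x 1, common slope
X1 = np.zeros((n, k))                   # n x k, block diagonal, so col(X0) is inside col(X1)
start = 0
for i, m in enumerate(sizes):
    X1[start:start + m, i] = x[start:start + m]
    start += m

U2 = np.hstack([X0, X1])
M = U2.T @ U2                           # singular, since X0 = X1 @ ones(k)
rank_M = np.linalg.matrix_rank(M)

p0, p1 = X0.shape[1], X1.shape[1]
G = np.zeros((p0 + p1, p0 + p1))        # candidate generalized inverse
G[p0:, p0:] = np.linalg.inv(X1.T @ X1)

P_U2 = U2 @ G @ U2.T
P_X1 = X1 @ np.linalg.inv(X1.T @ X1) @ X1.T
residual_X0 = (np.eye(n) - P_X1) @ X0   # should vanish for a hierarchical model
```

Here `M @ G @ M` reproduces `M` even though `M` is not invertible, and the resulting projection coincides with $P_{X_1}$.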

Hence for hierarchical models in general, and the random regression coefficient model in particular, there is a major simplification in the form of SSE$_2$, namely,

\[ \mathrm{SSE}_2 = \sum_{i=1}^{k} Y_i^T(I_{n_i} - P_{X_{1i}})Y_i. \]


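The following two representations are extremely useful:

\[ \mathrm{SSE}_1 = \sum_{i=1}^{k}(Y_i - EY_i)^T(I_{n_i} - C_{ii})(Y_i - EY_i) - \sum\sum_{1 \le i \ne j \le k}(Y_i - EY_i)^TC_{ij}(Y_j - EY_j) \tag{2.3.3} \]

where $C_{ij} = X_{0i}(X_0^TX_0)^{-1}X_{0j}^T$, and

\[ \mathrm{SSE}_2 = \sum_{i=1}^{k}(Y_i - EY_i)^T(I_{n_i} - P_{X_{1i}})(Y_i - EY_i)
- \sum_{i=1}^{k}\sum_{j=1}^{k}(Y_i - EY_i)^T(I_{n_i} - P_{X_{1i}})X_{0i}C_{1\cdot}^-X_{0j}^T(I_{n_j} - P_{X_{1j}})(Y_j - EY_j). \tag{2.3.4} \]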


Now denote

\[ g_1(Y_i) = E(\mathrm{SSE}_1 \mid Y_i) \quad \text{for } i = 1,\dots,k \]

and

\[ g_2(Y_i) = E(\mathrm{SSE}_2 \mid Y_i) \quad \text{for } i = 1,\dots,k. \]


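Lemma 2.3.1. Under model (2.1.1)

a) $E\big\{(\mathrm{SSE}_1 - E(\mathrm{SSE}_1)) - \sum_{i=1}^{k}(g_1(Y_i) - Eg_1(Y_i))\big\}^2 = O(p_0)\big(\max_{1 \le i \le k}\mathrm{ch}_L(X_{1i}X_{1i}^T)\big)^2$,

b) $E\big\{(\mathrm{SSE}_2 - E(\mathrm{SSE}_2)) - \sum_{i=1}^{k}(g_2(Y_i) - Eg_2(Y_i))\big\}^2 = O(p_0)$.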


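Proof. a) Using representation (2.3.3),

\[
\begin{aligned}
& E\Big\{(\mathrm{SSE}_1 - E(\mathrm{SSE}_1)) - \sum_{i=1}^{k}(g_1(Y_i) - Eg_1(Y_i))\Big\}^2 \\
&= E\Big\{\sum\sum_{1 \le i \ne j \le k}(Y_i - EY_i)^TC_{ij}(Y_j - EY_j)\Big\}^2 \\
&= 2\sum\sum_{1 \le i \ne j \le k}E\big((Y_i - EY_i)^TC_{ij}(Y_j - EY_j)(Y_j - EY_j)^TC_{ji}(Y_i - EY_i)\big) \\
&= 2\sum\sum_{1 \le i \ne j \le k}\mathrm{tr}[\mathrm{Var}(Y_i)C_{ij}\mathrm{Var}(Y_j)C_{ji}] \\
&= 2\sum\sum_{1 \le i \ne j \le k}\mathrm{tr}\big[(\sigma_1^2X_{1i}X_{1i}^T + \sigma_e^2I_{n_i})C_{ij}(\sigma_1^2X_{1j}X_{1j}^T + \sigma_e^2I_{n_j})C_{ji}\big] \\
&\le 2\big(\max_{1 \le i \le k}\mathrm{ch}_L(X_{1i}X_{1i}^T)\sigma_1^2 + \sigma_e^2\big)^2\sum\sum_{1 \le i \ne j \le k}\mathrm{tr}[C_{ij}C_{ji}] \\
&\le 2\big(\max_{1 \le i \le k}\mathrm{ch}_L(X_{1i}X_{1i}^T)\sigma_1^2 + \sigma_e^2\big)^2\sum_{i=1}^{k}\sum_{j=1}^{k}\mathrm{tr}\big[X_{0i}(X_0^TX_0)^{-1}X_{0j}^TX_{0j}(X_0^TX_0)^{-1}X_{0i}^T\big] \\
&= 2\big(\max_{1 \le i \le k}\mathrm{ch}_L(X_{1i}X_{1i}^T)\sigma_1^2 + \sigma_e^2\big)^2\sum_{i=1}^{k}\mathrm{tr}\big[X_{0i}^TX_{0i}(X_0^TX_0)^{-1}X_0^TX_0(X_0^TX_0)^{-1}\big] \\
&= 2\big(\max_{1 \le i \le k}\mathrm{ch}_L(X_{1i}X_{1i}^T)\sigma_1^2 + \sigma_e^2\big)^2\mathrm{tr}\big[X_0^TX_0(X_0^TX_0)^{-1}\big] \\
&= 2\big(\max_{1 \le i \le k}\mathrm{ch}_L(X_{1i}X_{1i}^T)\sigma_1^2 + \sigma_e^2\big)^2\mathrm{tr}[I_{p_0}] \\
&= \big(\max_{1 \le i \le k}\mathrm{ch}_L(X_{1i}X_{1i}^T)\big)^2O(p_0).
\end{aligned}
\]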


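b) First observe that the set of vectors $\{Y_i\}$ is an independent set by the structure of the model. Secondly, since $C_{1\cdot}$ is symmetric, $C_{1\cdot}^-$ may be taken to be symmetric. Finally note that $C_{1\cdot}C_{1\cdot}^-$ is an idempotent matrix, i.e., $(C_{1\cdot}C_{1\cdot}^-)(C_{1\cdot}C_{1\cdot}^-) = (C_{1\cdot}C_{1\cdot}^-C_{1\cdot})C_{1\cdot}^- = C_{1\cdot}C_{1\cdot}^-$. Using representation (2.3.4),

\[
\begin{aligned}
& E\Big\{(\mathrm{SSE}_2 - E(\mathrm{SSE}_2)) - \sum_{i=1}^{k}(g_2(Y_i) - Eg_2(Y_i))\Big\}^2 \\
&= 2\sum\sum_{1 \le i \ne j \le k}E\Big[(Y_i - EY_i)^T(I_{n_i} - P_{X_{1i}})X_{0i}C_{1\cdot}^-X_{0j}^T(I_{n_j} - P_{X_{1j}})(Y_j - EY_j) \\
&\qquad\qquad \times (Y_j - EY_j)^T(I_{n_j} - P_{X_{1j}})X_{0j}C_{1\cdot}^-X_{0i}^T(I_{n_i} - P_{X_{1i}})(Y_i - EY_i)\Big] \\
&= 2\sum\sum_{1 \le i \ne j \le k}\mathrm{tr}\Big[(I_{n_i} - P_{X_{1i}})X_{0i}C_{1\cdot}^-X_{0j}^T(I_{n_j} - P_{X_{1j}})\sigma_e^2I_{n_j} \\
&\qquad\qquad \times (I_{n_j} - P_{X_{1j}})X_{0j}C_{1\cdot}^-X_{0i}^T(I_{n_i} - P_{X_{1i}})\sigma_e^2I_{n_i}\Big] \\
&= 2\sigma_e^4\sum\sum_{1 \le i \ne j \le k}\mathrm{tr}\big[X_{0i}^T(I_{n_i} - P_{X_{1i}})X_{0i}C_{1\cdot}^-X_{0j}^T(I_{n_j} - P_{X_{1j}})X_{0j}C_{1\cdot}^-\big] \\
&\le 2\sigma_e^4\sum_{i=1}^{k}\sum_{j=1}^{k}\mathrm{tr}\big[X_{0i}^T(I_{n_i} - P_{X_{1i}})X_{0i}C_{1\cdot}^-X_{0j}^T(I_{n_j} - P_{X_{1j}})X_{0j}C_{1\cdot}^-\big] \\
&= 2\sigma_e^4\,\mathrm{tr}[C_{1\cdot}C_{1\cdot}^-C_{1\cdot}C_{1\cdot}^-] \\
&= 2\sigma_e^4\,\mathrm{tr}[C_{1\cdot}C_{1\cdot}^-] \\
&= 2\sigma_e^4\,\mathrm{rank}[C_{1\cdot}] \\
&\le 2\sigma_e^4\,\mathrm{rank}[X_0] \\
&= O(p_0).
\end{aligned}
\]

QED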


As a continuation of the remarks about hierarchical models in general, and the random regression coefficient model in particular, note that for these models Lemma 2.3.1 b) is irrelevant, since for these models SSE$_2$ is exactly a sum of independent functions and Lemma 2.3.1 b) is showing that SSE$_2$ can be approximated by such a sum.

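Now the major objective of obtaining a central limit theorem for the vector of estimates $(\hat\sigma_1^2, \hat\sigma_e^2)^T$ can be seen as equivalent to a central limit theorem for the vector of projections $\big(\sum_{i=1}^{k}g_1(Y_i), \sum_{i=1}^{k}g_2(Y_i)\big)^T$. Denote the vector of components we are attempting to estimate by $\boldsymbol\sigma^2$, i.e., $\boldsymbol\sigma^2 = (\sigma_1^2, \sigma_e^2)^T$, and denote the vector of estimates by $\hat{\boldsymbol\sigma}^2 = (\hat\sigma_1^2, \hat\sigma_e^2)^T$. The vector of estimates $\hat{\boldsymbol\sigma}^2$ is a linear function of $(\mathrm{SSE}_1, \mathrm{SSE}_2)^T$ so we can write

\[ \hat{\boldsymbol\sigma}^2 = H_k\begin{pmatrix} \mathrm{SSE}_1 \\ \mathrm{SSE}_2 \end{pmatrix} \]

where $H_k$ is an invertible matrix, i.e.,

\[ H_k = \begin{pmatrix} r_1^{-1} & -r_0r_1^{-1}r_2^{-1} \\ 0 & r_2^{-1} \end{pmatrix}. \]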

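The following assumptions are sufficient to have a central limit theorem for $\hat{\boldsymbol\sigma}^2$.

A1 i) $\sigma_e^2 > 0$,

ii) $\frac{p_0}{k}\max_{1 \le i \le k}\mathrm{ch}_L(X_{1i}X_{1i}^T) \to 0$ as $k \to \infty$,

iii) $\lim_{k \to \infty}\frac{1}{k}r_2$ is positive,

iv) $\lim_{k \to \infty}\frac{1}{k}r_1$ is positive,

v) $\frac{1}{k}\mathrm{Var}\begin{pmatrix} \mathrm{SSE}_1 \\ \mathrm{SSE}_2 \end{pmatrix} \to V$ where $V$ is positive definite.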


Note that A1 ii) is a relatively mild condition since, first of all, most authors assume $p_0$ is fixed, and secondly for the standard models $\max_{1 \le i \le k}\mathrm{ch}_L(X_{1i}X_{1i}^T) \le \max_{1 \le i \le k}n_i \le N$. Assumption A1 ii) is needed to show that the SSE$_1$ term can be approximated by a sum of independent random variables, which is used in the proof of the central limit theorem. Taken together, assumptions ii), iii), and iv) imply that

vi) $\lim_{k \to \infty}\frac{1}{k}r_0$ is finite.

Now combining iii), iv), and vi) we see that

\[ \lim_{k \to \infty}\frac{1}{k}H_k^{-1} = H_*^{-1} \]

where, as the notation implies, $H_*^{-1}$ is invertible.

Next, if we observe that $p_1k \le \mathrm{rank}(U_2) \le (p_1k + p_0)$, then $\frac{1}{k}r_2 = \frac{1}{k}(n_\cdot - \mathrm{rank}(P_{U_2}))$ has a limit if and only if $\lim_{k \to \infty}\frac{1}{k}n_\cdot = n_*$ exists. The value $n_*$ may be thought of as the long run average cell size.

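Theorem 2.3.1 Under model (2.1.1) with assumption A1, the standardized vector of estimates is asymptotically distributed as $N_2(\mathbf{0}, I_2)$.

Proof. By the Cramér-Wold device it suffices to show, for any nonzero vector $d \in R^2\setminus\{0\}$, that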




rjk  (dTVkd)  2dT 


/E«i(Yi)\  /Esi(Yi)> 

h 

VEg2«i)'  VEs2(Yi)' 

i=l  i=l 


N(0,1) 


k 


/E  *i(Yi)> 


For  convenience  denote  qk = d 


= HT[i=l 


E s2( Yi  )y 

i=l 


and  Qk  = dT  (gg^). 


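Note that

\[
\begin{aligned}
\mathrm{Var}(Q_k) &= \mathrm{Var}(d_1\mathrm{SSE}_1 + d_2\mathrm{SSE}_2) \\
&= \mathrm{Var}\big(d_1Y^T(I - P_{U_1})Y + d_2Y^T(I - P_{U_2})Y\big) \\
&= \mathrm{Var}\big\{d_1[\beta_1^TX_1^T(I - P_{U_1})X_1\beta_1 + 2\beta_1^TX_1^T(I - P_{U_1})e + e^T(I - P_{U_1})e] + d_2[e^T(I - P_{U_2})e]\big\} \\
&= \mathrm{Var}\big(d_1[\beta_1^TX_1^T(I - P_{U_1})X_1\beta_1]\big) + 4d_1^2\mathrm{Var}\big(\beta_1^TX_1^T(I - P_{U_1})e\big) \\
&\qquad + \mathrm{Var}\big(e^T[d_1(I - P_{U_1}) + d_2(I - P_{U_2})]e\big) \\
&\ge \mathrm{Var}\big(e^T[d_1(I - P_{U_2} + P_{U_2} - P_{U_1}) + d_2(I - P_{U_2})]e\big) \\
&\ge \min(\mu_{4,e} - \sigma_e^4;\ 2\sigma_e^4)\big\{d_1^2\,\mathrm{tr}[P_{U_2} - P_{U_1}] + (d_1 + d_2)^2\,\mathrm{tr}[I - P_{U_2}]\big\}
\end{aligned}
\]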




where $\mu_{4,e}$ denotes the fourth moment of $e_{ij}$. Using assumptions A1, we see that there are positive constants $0 < m_1 < M_1 < \infty$, so that

\[ 0 < m_1 < \frac{1}{k}\mathrm{Var}(Q_k) < M_1 < \infty. \]

By Lemma 2.3.1, $\mathrm{Var}(q_k - Q_k) = O(p_0)\big(\max_{1 \le i \le k}\mathrm{ch}_L(X_{1i}X_{1i}^T)\big)^2$.

Thus there are positive constants $m_2$ and $M_2$, so that

\[ 0 < m_2 < \frac{1}{k}\mathrm{Var}(q_k) < M_2 < \infty. \]


It  follows  from  theorem  2 of  Whittle  (1960)  that  for  i = l,***,k 


E|vT(in. -pUi.  )Yi  -eyT(i„.  -PUii)Yi  |2+<. 


<L1(2  + 61)D1(rank(J„.-Pu  J)1+2 

1 -11 


EiYldn  -Py,  .Wi-EYTd „ -P„  )Yj 


| 2+<5n 


6~ 
i+4 


<L2(2  + <i2)D2(rank(In.  -Pu  ))  2 

i -2,1 


where  Lj  and  Dj  are  independent  of  k,  Lj  is  a function  of  <5 j only,  and 

1 <mfX<k{El  l-il  1 4+2Sl  ’ E I 1 2+^>  < Di  V D2 . 

Now  since 




K 2 + 6 

£e(|yT(I  -P  )Y.-EYT(I  Py  )Y  I j)=0(k)  for  j = l,2 
i=l  1 1 ~Ji  1 1 1 ~Ji  1 


then

\[ \sum_{i=1}^{k}E\big(|d_1g_1(Y_i) + d_2g_2(Y_i) - E(d_1g_1(Y_i) + d_2g_2(Y_i))|^{2+\delta}\big) = O(k). \]

Taken together with the statement that $\frac{1}{k}\mathrm{Var}(q_k) > m_2$, this is what is necessary in the Liapounov condition to show that


>fk(dTVkd)"2(qk_Eqk)-iN(0,l). 


This fact proves the univariate central limit theorem, from which the multivariate central limit theorem follows.

2.4 Jackknife Estimates


The  idea  of  the  jackknife  estimate  was  originated  by  Quenouille 
(1949),  then  extended  by  Quenouille  (1956),  and  finally  canonized  by 
Tukey  (1958).  While  adapted  to  nearly  all  areas  of  statistics  by 
various  authors  the  present  work  follows  the  path  of  the  trailblazing 
paper  by  Arvesen  (1969)  on  jackknifing  U statistics. 

The concept of the jackknife is simple but powerful. Suppose a sequence $Y_1,\dots,Y_n$ of independent observations is taken from a distribution $F$. We wish to estimate some parameter $\theta = \theta(F)$ using a functional statistic $\hat\theta(Y_1,\dots,Y_n)$. It is possible to create a set of $n$ estimates of $\theta$ by deleting the observations one at a time and recalculating the estimate, $\hat\theta_{-i}(Y_1,\dots,Y_{i-1},Y_{i+1},\dots,Y_n)$, based only on the set of $n-1$ observations with the $i$th observation deleted. These deleted values can be used to produce what Tukey called pseudovalues, that is,

\[ \tilde\theta_{-i} = n\hat\theta - (n-1)\hat\theta_{-i} \quad \text{for } i = 1,\dots,n. \]

These pseudovalues can be used to provide an estimate of $\theta$, namely, $\hat\theta_{jack} = \frac{1}{n}\sum_{i=1}^{n}\tilde\theta_{-i}$, but of at least as great an importance is the fact that the pseudovalues give an estimate of the variance of $\hat\theta_{jack}$. The method of estimating the variance was also proposed by Tukey. This estimate is of the form

\[ \widehat{\mathrm{Var}}(\hat\theta) = \frac{1}{n(n-1)}\sum_{i=1}^{n}(\tilde\theta_{-i} - \hat\theta_{jack})^2. \]
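The delete-one recipe just described can be sketched in a few lines. This is an illustrative implementation, assuming a generic statistic supplied as a function; the name `jackknife` and the simulated sample are hypothetical and not taken from the original text.

```python
import numpy as np

def jackknife(data, statistic):
    """Delete-one jackknife: returns (estimate, variance estimate, pseudovalues)."""
    data = np.asarray(data)
    n = len(data)
    theta_hat = statistic(data)
    # Recompute the statistic with the i-th observation deleted.
    leave_one_out = np.array([statistic(np.delete(data, i, axis=0))
                              for i in range(n)])
    # Tukey's pseudovalues: n * theta_hat - (n - 1) * theta_hat_{-i}.
    pseudovalues = n * theta_hat - (n - 1) * leave_one_out
    theta_jack = pseudovalues.mean()
    var_jack = np.sum((pseudovalues - theta_jack) ** 2) / (n * (n - 1))
    return theta_jack, var_jack, pseudovalues

rng = np.random.default_rng(2)
sample = rng.normal(loc=1.0, scale=2.0, size=25)   # illustrative data
theta_jack, var_jack, pseudo = jackknife(sample, np.mean)
```

For the sample mean the pseudovalues are the observations themselves, so the jackknife estimate is the sample mean and the variance estimate is the familiar $s^2/n$. This makes the mean a convenient check before applying the function to a nonlinear statistic.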


It  is  the  purpose  of  this  section  to  show  that  the  jackknife 

estimates  of  the  variance  components  obey  a central  limit  theorem,  as 

in  the  previous  section,  and  additionally  the  jackknife  estimate  of 

the  variance  of  the  estimator  converges  in  probability. 

To deal with the jackknife the following notation will be necessary. For each cell or small area we have denoted $Y_i = \mathrm{col}_{1 \le j \le n_i}(Y_{ij})$ so that $Y = \mathrm{col}_{1 \le i \le k}(Y_i)$. Corresponding to this partition of $Y$ is a partition of $X_0$, $X_1$, and $e$ so that




Yi=Xoi^0  + Xli^1  + ei 


for  i = 1 , • • • ,k. 


A notation to denote the deletion of the $i$th group is also needed. Denote all of the observations except the $i$th group by

\[ Y_{-i} = \mathrm{col}_{1 \le \ell \le k,\ \ell \ne i}(Y_\ell) \tag{2.4.1} \]


and  similarly 


col 
1 < t < k 

i 


() 


col 
1 < L < k 


«i£> 


(2.4.2) 


so  that 


S-i 


col  (e.) 
1 <£<k  £ 

M i 


~-i  =X 


o,  - i^o  + , -i^i ? _i + §_i • 


Next  denote  the  corresponding  design  matrices  by 




Recalling  representation  (2.3.2)  let 


and  Py  -px  .+(In.-n,-px  . )5o, -i?i  ,-iYo,-i<!n. -»i  “ PX 

""l  j ~ 1 *1  j ~ 1 "U  j - 1 1 1 ) 


Then  SSE 


1,-i  = YTi(i„..„.-Pxo>_.)v_i 


SSE2.-i=Yli<!n.-ni-pU,  i 

1 x 


"o,-i  =rank(In.-n.  -pn  .) 

'l,-i=tr^,-i(Jn.-ni-PUl  .)*i,-i]' 

V-i  =rank(Jn.-n.  -pU  .) 

’ 1 ~2 , - 1 


For  the  deleted  versions  there  are  formulations  similar  to  (2.J 
(2.3.4),  namely 


SSEi,.i=E.(YJ-EVj)T(In.-Cjj>.i)(Vj-EYj) 

(*i) 


(2.4.3) 

.) 

i 

(2.4.4) 


.3)  and 


(2.4.5) 




and 

SSE 


2,.i  = 5.(Vj-EYj)T(I  -px  )(Yj-EYj) 

’ J ± 1 J J J ~1J  J J 


-E.(YJ-EYj)T(i„.-Pxl.)5oj5i;-i5jj(Inj-Pxlj)(Yj-EYJ: 


-EE 

1 < £ ^ m < k 

(#i) 


-EYe)T(  In,- Pxt^XotC -_iXTm(Inm  - P?im)  (Vm  - EYm) 


(2.4.6) 


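The jackknife estimate of $\sigma_e^2$ is

\[ \hat\sigma_{e,jack}^2 = k\hat\sigma_e^2 - \frac{k-1}{k}\sum_{i=1}^{k}\hat\sigma_{e,-i}^2 \tag{2.4.7} \]

where $\hat\sigma_e^2 = \frac{1}{r_2}\mathrm{SSE}_2$ and $\hat\sigma_{e,-i}^2 = \frac{1}{r_{2,-i}}\mathrm{SSE}_{2,-i}$.

The jackknife estimate of $\sigma_1^2$ is

\[ \hat\sigma_{1,jack}^2 = k\hat\sigma_1^2 - \frac{k-1}{k}\sum_{i=1}^{k}\hat\sigma_{1,-i}^2 \tag{2.4.8} \]

where $\hat\sigma_1^2 = \frac{1}{r_1}\mathrm{SSE}_1 - \frac{r_0}{r_2r_1}\mathrm{SSE}_2$ and $\hat\sigma_{1,-i}^2 = \frac{1}{r_{1,-i}}\mathrm{SSE}_{1,-i} - \frac{r_{0,-i}}{r_{1,-i}}\hat\sigma_{e,-i}^2$.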


As an example consider the random regression coefficient model. Note that for SSE$_2$ the multiplier $1/r_{2,-i}$ in general prevents the term $\sum_{i=1}^{k}\hat\sigma_{e,-i}^2$ from adding up to a multiple of $\hat\sigma_e^2$. In the balanced case, however, with $r_{2,-i}$ the same for all $i$, the jackknife estimate collapses to the usual estimate of $\sigma_e^2$.
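The collapse in the balanced case can be illustrated numerically. The sketch below assumes the random regression coefficient model with a single regressor per group, so that SSE$_2$ is a sum of within-group residual sums of squares and each group contributes $n_i - 1$ degrees of freedom; the function names and simulated data are hypothetical.

```python
import numpy as np

def sse2_group(x, y):
    """Within-group residual sum of squares of y regressed on x (no intercept)."""
    return np.sum(y ** 2) - np.sum(x * y) ** 2 / np.sum(x ** 2)

def sigma_e_estimates(x_groups, y_groups):
    """Usual and group-deleted jackknife estimates of the error variance."""
    k = len(x_groups)
    s = np.array([sse2_group(x, y) for x, y in zip(x_groups, y_groups)])
    dof = np.array([len(x) - 1 for x in x_groups])    # n_i - rank(X_1i), rank 1 here
    sse2, r2 = s.sum(), dof.sum()
    usual = sse2 / r2
    deleted = (sse2 - s) / (r2 - dof)                 # SSE_{2,-i} / r_{2,-i}
    jack = k * usual - (k - 1) / k * deleted.sum()
    return usual, jack

def simulate(sizes, rng):
    """Illustrative data from Y_ij = x_ij (beta_0 + v_i) + e_ij."""
    xs = [rng.uniform(1.0, 2.0, size=m) for m in sizes]
    ys = [x * (1.0 + 0.5 * rng.normal()) + rng.normal(size=len(x)) for x in xs]
    return xs, ys

rng = np.random.default_rng(4)
usual_bal, jack_bal = sigma_e_estimates(*simulate([5, 5, 5, 5, 5, 5], rng))
usual_unb, jack_unb = sigma_e_estimates(*simulate([3, 8, 4, 10, 5, 6], rng))
```

With equal group sizes the two estimates coincide up to rounding. With unequal sizes they generally differ, which is the situation treated next.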


However  for  the  unbalanced  case 


<72 


2 _ k-2  k-1  v'  *2 

e , jack  e k 


L=1  2>-1 


k r 


_ 1 1 + 1 

i = Hr2 , - i r2  r2 


SSE 


2,-1 


i.i.2  k-1  1 ccp  k-1  v'  2 2>  iccr 

k(7e  k r^tiSSE2,-i  k r2,_iSSE2,-i 


ae  (k_1)5'e  kic1^r2,  r2,_iSSE2,-i 


*2  k-1  V-  2 2 , i qqp 

e k r2,_iSE2,-i- 


Since $r_2 - r_{2,-i}$ is bounded by the fixed number $N$, the second term converges to zero in probability and the jackknife estimate is asymptotically equivalent to the usual estimate.


Recall that $\lim_{k \to \infty}\frac{1}{k}\sum_{i=1}^{k}n_i = n_*$ exists as a consequence of assumption

A1 . Denote  nx^  = n+,  n2  f * = n*  — pj  , and  finally  denote 


T,  = -LrSSE, 
1 n1+k  1 


(2.4.9a) 


T2  = -3-j-SSEo 

2 n2*k  2 


(2.4.9b) 



The quantities $T_1$ and $T_2$ are more convenient to work with and asymptotically equivalent to the corresponding terms with the correct divisors.

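Lemma 2.4.1. Under model (2.1.1) with assumptions A1

a) $\mathrm{Var}(T_1 - \frac{1}{r_1}\mathrm{SSE}_1) = o(k^{-1})$

b) $\mathrm{Var}(T_2 - \frac{1}{r_2}\mathrm{SSE}_2) = o(k^{-1})$

Proof.

a)
\[
\begin{aligned}
\mathrm{Var}\Big(T_1 - \frac{1}{r_1}\mathrm{SSE}_1\Big) &= \mathrm{Var}\Big(\frac{1}{n_{1*}k}\mathrm{SSE}_1 - \frac{1}{r_1}\mathrm{SSE}_1\Big) \\
&= \Big(\frac{1}{n_{1*}k} - \frac{1}{r_1}\Big)^2\mathrm{Var}(\mathrm{SSE}_1) \\
&= \Big(\frac{r_1}{n_{1*}k} - 1\Big)^2\frac{1}{r_1^2}\mathrm{Var}(\mathrm{SSE}_1) \\
&= o(1)\cdot O(k^{-2})O(k) \\
&= o(k^{-1}).
\end{aligned}
\]

b)
\[
\begin{aligned}
\mathrm{Var}(T_2 - \hat\sigma_e^2) &= \mathrm{Var}\Big(\frac{1}{n_{2*}k}\mathrm{SSE}_2 - \frac{1}{r_2}\mathrm{SSE}_2\Big) \\
&= \Big(\frac{1}{n_{2*}k} - \frac{1}{r_2}\Big)^2\mathrm{Var}(\mathrm{SSE}_2) \\
&= \Big(\frac{r_2}{n_{2*}k} - 1\Big)^2\frac{1}{r_2^2}\mathrm{Var}(\mathrm{SSE}_2) \\
&= o(1)O(k^{-2})O(k) \\
&= o(k^{-1}).
\end{aligned}
\]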


Next for $j = 1, 2$ and $i = 1,\dots,k$ let

\[ T_{j,-i} = \frac{1}{(k-1)n_{j*}}\mathrm{SSE}_{j,-i} \]

so the jackknife estimate involving $T_j$ is

\[ T_{j,jack} = kT_j - \frac{k-1}{k}\sum_{i=1}^{k}T_{j,-i}. \]


By using (2.4.6) we see that

\[ n_{2*}T_{2,jack} = n_{2*}\Big[kT_2 - \frac{k-1}{k}\sum_{i=1}^{k}T_{2,-i}\Big] = \mathrm{SSE}_2 - \frac{1}{k}\sum_{i=1}^{k}\mathrm{SSE}_{2,-i} \]


: E (Yj-EY.)T(In.-Px  .)(Y.-EY. 
j=l  J J J ~1J  J J 




1 < j #«<k 


k k 


-£  E £ (Yj-EY.)T(I  -Px  ,)(Yj — EYj) 

k i = l j # 1 J J J ~ij  J J 


+iz  EE  (Yj-EVj)^;  p )X  c-_iXT(I  -px  KY^EV,) 

i = l J -1J  J ’ t -l£ 

(#i) 


— n2*^2 


£ EE  (Yj-EYj)T(I  -PX  .JXojCT.xJa „ -px  .KYj-EYj) 

2l<j^£<kJ  J J~1J  J e ~l£ 


E <E/YrEYf)T(in(PXi4)xof(C7i_icr.)xT(!n(-P$|j)(Yf-EY<) 


i=l£^  i 


£ *l£' 


+ ^E_5E^i(YrEYpT(In.-PXi.)Xoj(C7j.rCrjxY(InrPJif)(Y(-EY<) 


4=11  < j ^ C < k 

(#i) 


« ~i£ 


(2.4.10) 


For the next two parts the following two technical lemmas are needed.

Lemma 2.4.2. If $P_1$ and $P_2$ are projection matrices, i.e., symmetric and idempotent, then $\mathrm{tr}[P_1P_2] \le \mathrm{rank}[P_1]$.

Proof. $\mathrm{rank}[P_1] = \mathrm{tr}[P_1] = \mathrm{tr}[P_1P_2] + \mathrm{tr}[P_1(I - P_2)] \ge \mathrm{tr}[P_1P_2]$, where the last inequality follows from the fact that

\[ \mathrm{tr}[P_1(I - P_2)] = \mathrm{tr}[P_1(I - P_2)P_1] \]

and this is the trace of a nonnegative definite matrix. QED
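Lemma 2.4.2 is easy to probe numerically. The following sketch assumes randomly generated orthogonal projections of chosen ranks in a six-dimensional space; the helper `random_projection` is hypothetical and used only for illustration.

```python
import numpy as np

def random_projection(n, r, rng):
    """Orthogonal projection onto a random r-dimensional subspace of R^n."""
    basis, _ = np.linalg.qr(rng.normal(size=(n, r)))   # n x r orthonormal columns
    return basis @ basis.T

rng = np.random.default_rng(3)
n = 6
results = []
for r1, r2 in [(1, 4), (2, 2), (3, 5), (4, 1)]:
    P1 = random_projection(n, r1, rng)
    P2 = random_projection(n, r2, rng)
    trace_product = np.trace(P1 @ P2)
    # tr[P1 (I - P2)] equals tr[P1 (I - P2) P1], the trace of a PSD matrix.
    psd = P1 @ (np.eye(n) - P2) @ P1
    gap = np.trace(P1 @ (np.eye(n) - P2))
    results.append((trace_product, r1, gap, np.linalg.eigvalsh(psd).min()))
```

Each tuple in `results` holds the trace of the product, the rank of the first projection, the nonnegative gap between them, and the smallest eigenvalue of the matrix used in the proof.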


Recall that $C_{1\cdot} = \sum_{i=1}^{k}X_{0i}^T(I_{n_i} - P_{X_{1i}})X_{0i}$. Denote $r_a = \mathrm{rank}(C_{1\cdot})$ so that $r_a \le p_0 = \mathrm{rank}(X_0)$.


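Lemma 2.4.3 Under model (2.1.1) it is possible to choose a sequence of $\ell$ ($\le r_a$) integers $1 \le i_1 < \cdots < i_\ell \le k$ so that

\[ \mathrm{col}\Big(\sum_{j=1}^{\ell}X_{0i_j}^T(I_{n_{i_j}} - P_{X_{1i_j}})X_{0i_j}\Big) = \mathrm{col}(C_{1\cdot}). \]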

Proof . Choose  the  minimal  ix  so  that 


rank(xji  (I 

1 1 


T T 

If  rank(X(U  (In-  — P\  )-oi  ) = ra  fhen  we  are  done, 
1 II  ~ 1 i i 1 

minimal  i2  > ij  so  that 


else  choose  the 


ranked  (In  -Pv 

1 1 1 ~ 1 1 


)tfi, 


+ (In;  -Pv  . )>rank(X1i  (I.  -P 


-01c 


By  induction,  there  is  a sequence  1 < i < •••  < ig  < k where  1 < i < ra  < p0 
so  that 


,'T 
j=l 


rank(EX01i.(In.  - ?X 


-li 


)X0i  .)  =rank(C1>)  =rfl. 


If  I = {i1,---,ig}  then  for  any  j£I 


QED 




coi(xJj(!nj-Pxi.)S0j)ccoi(c,.)  = coi(iE  xJi (In.-PS|.W0i) 


Now  since  each  X^.(Int— )X  • is  nonnegative  definite,  for 
J ~ J ~lj  ~ J 

j = 1 , • • • , k then  for  any  j £ I 


c°l(Xpj(In  . — ,)X0j)Ccol(Clj_j).  (2.4.11) 

J l J 


According to Henderson and Searle (1981), condition (2.4.11) guarantees that $C_{1,-j}^- - C_{1\cdot}^-$ is nonnegative definite, for $j \notin I$.

In  addition  to  assumption  A1  for  the  central  limit  theorem  other 
assumptions  for  the  jackknife  are  needed. 


A2  !)  (l<T<k  ChL(S>i^i))2  ^ - 0 


as  k— >oo 


k 

ii)£ 


£ max  max  diagonal  element  of  (In  — Pv  )X  • 

i = ll  < i <k  1 <f<nj  V'nj 


(c1>.j-cr.)xt(in.-p5..)}=Q(po) 


Lemma  2.4.4  Under  model  (2.1.1)  with  assumptions  A1  and  A2 


Proof.  It  suffices  to  show  that  the  last  three  terms  in  (2.4.10) 
converge  to  0 in  probability  when  multiplied  by  ^k. 

^ ^ 1 <i"J^<k^'i_E~i^T('Jni-Pxii^'oi'1''°^In£_P^i^~£_E-£)} 




2 EE  tr[(I  -PX  )XoiCijT(I  p )Var(vt)(I  -PX  ) 

1 < i ^ i < k 1 n e -ie  t -it 


< i ^ L < k 


x„{er.&a„i-rxli)v"-«i)] 


=24  ee  tr[(I „ -px  )xoicr.xj'(j  P x )So(cr.xji(i  -P x ) 

l<i^£<k  1 ii  e ~l£  1 11 


k k 


< 24.ee  tr[x^  « - px  )x0icr.sj£a„  - Pxi()x0<cr.  ] 


= 2<r|tr[gi.  C^.C^] 


= 2<Tgrank[C1  # Cx  . ] 


= 0(Po) 


1 1 


> K[|n<?^<kaJ_EYJ)T(I"j_^j)-“J(cr'-rEr-)!!»f  ‘W 

( # i) 

(Yj-EY,)]2} 


k k 


= E Ee{[  EE  (YrEY.)T(in.-Px  .)X0.(C7  _i-cr.)xT 
i=l  i'=l  Ll<j^C<k  J J J Yjj  oj  i,  i ot 

( 7^  i ) 


(I"r%>^-^)<SE,<k(VElj')T(I»y-pxiJ,«oj' 

( # i') 

(57, -i '-CT.1&  (Int,-Pxi4,)(V{,-EYt,)} 




2E  £E  EE  (yrEY,)T(!  -px  ,)xoj(c- ..i-crjxJ ; (I „ -p x ) 

i=l  i'-\  1 < j ^ < k J J J “ij  J l -it 

(Y(  - E3t*)T  «n<,  - Psi4,)*«rf(cr, -i  - GT.  }S?j(  In  j - PX,  jOf  j - EJf  j) 

4 k{  i <££<  V 

( # i,i') 


k k 


<2<4£  E tr[C  ,(C~  .,-Cr.)C 

j/=j  M_1  J-1  A?-1  A?  A?  1 A,  1 


k k 


<241  E tr[C  :(C“  .-C7JC  .,(C^_.#-er.)] 


k k 


:arljii5itr[(gl!.icr,.i-s1,.icr.)(c1>.A,-i'-e.,-i'®r.)] 


= 2<E 1 E{tr[(C1>_ic-_i-c,.cr.)(C  ,c-  ,-c  ,cr.) 

1 = 1 i=l 


+ 2tr[xJi(Illi-P5].)XoiCr.(Cl!_i,C-_i,-51;_i«-.)] 


+ ‘r[xJi(in.-PSi.)xoicrjiKinil-Px].,)xo.,c;-.]} 


k k 


2<E  E tr[(C]  jC~.j-Cj.C-  )(C  -i'CT-i'-fix.Cr. 

i = l i'=l  ’ ’ 


)] 


+ 4<e  tr[cr. ?! . (c  iC-  i - c, . cr. ) ] 

i=l 


+ 2<r4etr[C1>C7>C1>C7J. 




Note that it is possible to choose generalized inverses so that $C_{1\cdot}C_{1\cdot}^-$ is symmetric and $C_{1,-i}C_{1,-i}^-$ is symmetric for $i = 1,\dots,k$. If necessary a Moore-Penrose inverse can be used as the generalized inverse. If this condition is satisfied then $C_{1,-i}C_{1,-i}^-$, for $i = 1,\dots,k$, as well as $C_{1\cdot}C_{1\cdot}^-$, are all projection matrices. Recalling the definition of $I$ from above, we see that for any $i \notin I$, $C_{1,-i}$ has the same column space as $C_{1\cdot}$. Thus both $C_{1\cdot}C_{1\cdot}^-$ and $C_{1,-i}C_{1,-i}^-$ are projections onto the same column space. Since projections are unique, $C_{1\cdot}C_{1\cdot}^- = C_{1,-i}C_{1,-i}^-$. Thus, the last term above is equal to


= 2<r4 


4§E  E tr[(cl5_ig-_i-c1.cr.)(ci  ^'-Gi.GT.n 

i e ii'e  i ’ 1,11,1 


+ 4cre  E ^{tr  [Si . Cl . _ i 5i , _ i ] “ rank  [Cj . Cx  # } 


+ ~aera 


— o~4 


2*4eE  E tr[c  iC-.j-Cj.crjCC  ,c-  ,-Cj.cr.)] 

i eii'ei  ’ 1,11,1 


+ 4oS  E i(rank[C1 , _ iCj , _ i ] - rank  [Cj . Cj . ]} 


+ 2crera. 


Note that if $P_1$ and $P_2$ are projection matrices, then $P_1 - P_2$ is also a projection matrix if $P_1P_2 = P_2P_1 = P_2$. Furthermore, by the construction of $I$, $\mathrm{col}(C_{1,-i}) \subset \mathrm{col}(C_{1\cdot})$ for $i \in I$. Now under the assumption that $C_{1,-i}C_{1,-i}^-$, for $i = 1,\dots,k$, as well as $C_{1\cdot}C_{1\cdot}^-$, are symmetric, they are seen to be projection matrices since they are idempotent. Therefore, because $\mathrm{col}(C_{1,-i}) \subset \mathrm{col}(C_{1\cdot})$, the matrix $C_{1\cdot}C_{1\cdot}^- - C_{1,-i}C_{1,-i}^-$ is a projection matrix and thus nonnegative definite for $i \in I$. Thus


trt(Clj.iC-_i-C1.Cr.)(C1_i,C-_i,-C1.C7.)] 


< min|rank(C1-C1< -Cj  _.CX  _•),  rank(C1>C1<  — _ ^/C”  _£/)}• 


Hence  the  last  step  of  the  equation  above  is  bounded  by 


<2<4  E £ rank(C1.CE-C1  C~  _{) +2cr%ra 
ieii'Gi  ’ ’ ’ 


<2<Teri  + 2<Tera- 


Finally,  the  last  term  is  somewhat  problematic. 


iii)  First  consider  the  terms 


for $i \notin I$. According to the definition of $I$, if $i \notin I$ then $\mathrm{col}(C_{1,-i}) = \mathrm{col}(C_{1\cdot})$, so it must be that $\mathrm{col}(X_{0i}^T(I_{n_i} - P_{X_{1i}})X_{0i}) \subset \mathrm{col}(C_{1,-i})$. As pointed out earlier, this is the sufficient condition in Henderson and Searle to show that $C_{1,-i}^- - C_{1\cdot}^-$ is a nonnegative definite matrix for each $i \notin I$. Hence


E{  .E.(Yj-  EYj)T(inj-  Pxlj)Xoj(C7,_i-cr.)x0Tj«„j-  px,j)(Yj-EYj)} 


-„2 


»e.Etr[C  ;(C  j-CJ] 
101 


= 4 . -rc, . or.  ] + [xjj  (In . - P5i  i )xoi  cr.  ]} 


= 4.Etr[xT(i11.-p5i.)x0igr.] 


<4  .Etr[xJ.(I  -px  )x0icr.] 

1=1  1 ~11 


= 4tr[Ci.C1J 


_ 

-aer 


a 


This  shows  that 


i £ E (Yj 


j ) 


(Jnj'%j)-oj(6^-i_er-)-°j(JnrPX1i)(¥j"EYj) 


-1J 


converges to 0 in $L_1$ and hence in probability.




For those terms where $i \in I$ a different kind of argument must be used.


a + 5^  < kE(Vj  - EYj)T(  In  J -eSi  d )xoj(cr, -i  - cr, . )xJjCC7,_i  - cr, . Jsjj 
( /i) 

(i„j-Pxi.)(Yj-Eyj)(Y4-Ev/(!nf-PSi 


(Inf-Pxi£)(X«-EY«)) 


ZE  iE  {E  4m(j)(/*4.e-3<r« 

ielj^im=l 


+ 2-|tr[(In.-Px  )x„j(C7!.i-cr.)xJj(in.-Px  )xOJ(C7>_1-CT.)x;J] 


VT 


+ [4tr[(C7>_i  -cr.  )xTj(In  . - Px_  .)S0j]2} 


(U'li  ere  5mm(j)  is  the  diagonal  element  of 


=.E  (P4,e-34)  E E«iU(j) 

i € I j ^ i m=l 




+ 2<E  E.tr[X^(!  -PS  )Xoj(C-_i-CrjxTj(I  -P5  )Xoj 

1 £ 11  * 1 j J !J 


Ij# 


(67, -i -£■.)] 


+ E E E.4tr[(C7  i-CTJxJjd n.-P x .^cjjtrCCC-.i-CrJxT 

i 6 I j ^ i£  # i 

dn£-pXl£)^] 


= E 04,e~3<4)  E E^mrn(j) 

i6l  j ^ i m=l 


♦HE  £.  ti'KJj(iI.j-Pxlj)!!oj(C7,-i-cr.)s0Tjanj-Pxlj)s„j 


+ 4E  {tr[C7  i(C7  i-cr.)]}! 

l 6 I 


< (^4,e-3<Tl)E  E E^mm(j)  +<4  E {2ra}2 
i € I j ^ i m=l  i G I 


+24  E tr-cc  i(C7  i-cr.ic  .(C7  i-cr.)] 

i G I 


(c-.i-cr.)] 


<(/ie4-3<T^)  E E E*mm(j)  +4a4er3 


i G I j ^ i m=l 


'eA  a 


+24iEtr[c1).ic-_iclj_igr>.i-2Clj.ic7>.icli.igr.+g1;.icr.clj_icr.] 


“(Pe.j-S17!)  E E E <■■(  j)  +4»Jr| 

l € I j ^ iin=l 


+ 2^Ei{tr[Clj.i-C-_i]-2tr[gi>.1-Cr.]+tr[Clj.iCr.] 
-tr[Cr.5ji(lni-%.)+tr[xJi(In.-P$i.)Xoig7.xT(In.-P?  )Xoigr.] 




( /^4 , e 3<Te)  E E E ^mm(  J ) + 4<rera 

i G Ij  ^ i m=l 


+ 2<4  E {rank[C  _jC  _j] -rankfCj.gj.  ] + tr[C1>xJi(In. -Px  .)XQi] 
i G I ’ ’ i-li 


+ tr[Cr.X0Ti(In.-PSi.)XoiCr.xJi(I„1-Pxii)5oi]) 


< (/i4,e-3<Te)E  E E ^rnm(  j) +4<Ter 
i G I j ^ i m=l 


4„3 
a 


+ HE  trtCT.xJid  -PX  )Xol] 
i G I 1 ~li 


< (/i4,e-3<Te)E  E E *mm(  J)  + 4<4ra  + 4<rera 

i G I j 7^  i m=l 


< (^e,4-3<Te)  E ! “>ax  max  |«mm(j)|  E tr[(In.-Px  .) 

iGl1<J<kl<m<nj  j^i  J -lj 


*0  j ( CT,  - i ~ GT.  )«o‘  j ( In  j - \ • ) ] + 4<44  + 44ra 


= (P4,e-34)iEI  l<T<kl<;a<„.l<"""(j)l  iSjtrHjjdnj-Pj  > 

ioj(5T,-i-er.)]+44+44rJ„ 

= (p),c-34).Ei  1<*5x<ki<“|11  l‘.m(i)l('-“i‘[cr,.rSi,-i] 
-ra„k[C,.Cr.]  + tr[x3'i(In.-PXi.)X0iCrJ)+44ro(r|  + l) 


<(P4,e-34 


1 i ! 1 <T<  k 1 n ■ 1 ‘“( 1 ) I ■ (In i - PX,  i WoiST.  ] 




+ 4<4ra(ri  + 1) 

<(^,e-3(Te)ra  E “ax  1 I Snm(  J ) I + 4<4ra(ri  + 1 ) 

iel  1 < J < k 1 < m < n j 

= 0(Po)> 


which  proves  the  claim. 

Note that $\mu_{4,e} - 3\sigma_e^4$ is a multiple of the coefficient of kurtosis of $e$. If $e$ is normally distributed this term is zero and assumption A2(ii) is unnecessary. In fact, if $e$ has a negative coefficient of kurtosis then A2(ii) is unnecessary. Also, for any hierarchical model assumption A2(ii) is unnecessary because both $C_{1\cdot}$ and $C_{1,-i}$ are zero for $i = 1,\dots,k$.

It  should  be  pointed  out  that  for  i 0 I 


i?ii<T<ktr[(I"i-Px,i»oj(C7,-i-5r.)5jj(Ini-i‘xll)] 


<E  tr[C  i(5- 

l 6 I 


Thus, assumption A2(ii) is really an assumption of order $r_a$, not $k$.

There is a representation for $T_{1,jack}$ based on (2.3.3) which is necessary for the next lemma.




nl*Tl,  jack  ~ ni*kTi  ni*(k  1)k  i?1Tl»-i 

i k 

= SSE1-i  £sse1  . 
k i=1 


— ni*Ta 


+ nr fry  EE  (Y.-EY.)TC.f(Y|-EY£) 

k(k-l)  l<j^g<k  J J £ £ 

+ r E EE  (Y  i — EY  i)T[C  •£  - Cjj  _i](Y£-EY£) 

k i = l 1 < j ^L<k  3 J J J ’ 

(^i) 

(2.4.12) 

where, again, $C_{j\ell} = X_{0j}(X_0^TX_0)^{-1}X_{0\ell}^T$ and $C_{j\ell,-i} = X_{0j}(X_{0(-i)}^TX_{0(-i)})^-X_{0\ell}^T$.


Lemma 2.4.5 Under model (2.1.1) with assumptions A1 and A2

\[ \sqrt{k}\,(T_1 - T_{1,jack}) \xrightarrow{P} 0. \]

Proof. Again using representation (2.4.12) it suffices to show the last three terms converge to zero, in probability, when multiplied by $\sqrt{k}$.

As before it is possible to construct a sequence of $\ell_0$ integers ($\ell_0 \le p_0$) so that $\mathrm{col}\big(\sum_{j=1}^{\ell_0}X_{0i_j}^TX_{0i_j}\big) = \mathrm{col}(X_0^TX_0)$, although this sequence may be different from the previous one. Denote $I_0 = \{i_1,\dots,i_{\ell_0}\}$.


i)  E{[  EE  (Y^EYjTg.^Y^-EY^)]2} 

l<j?££<k  J J J 

= § EE  Eta.-EY/c.^-EY^)]2 

kl<j^£<k  J J JC  t 6 

= 1 EE  tr[C£. (X^xJ.cri  + In  .4)Cj£(Xl£xJ \a\  + I <r2)] 
Kl<j^£<k  J J J J J i 


( max 
1 < j < k 


( max  c 

1 < j <k 


chL(^ii^i)^  + 4)2i  < ? E < ktr[C£jCj£] 

hL(Ei^i)^  + (Te)2E  E trCX^CxJXo)-1^  X (xJXo)-1^ ] 

j=l  £=1  J J 


max 

1 < j <k 


max 

1 < j <k 


chL(^ii^i)^  + 4)2tr[Ip0] 

chL,( ~ii~ii ■E°’e)2Po‘ 


Since  Ep  EE  (Y  • — EY **(Y«  — EY»)  = 0 then  according  to 

1 < J t « <k  J ~J 

assumption  A2(i) 


^i<?#5<kaj'EYj)T5j<a{'EY<)Pj0' 


ii)  E{[£(  EE  )(Yi-EYj)T(Cj4-C.<(-i))(Vf-EY()]2} 

(#i) 

= E E(  EE  )(Yi-EY.)T(C.£-Ci£  .)(¥,  — EY,) 

i=l  i'=l  1 < j / £ < k J 1 1 1 

(#i) 

(*i') 


2E  E EE  E(y •-EV.)T(cjj-c jt  .jKYj-ev,) 
i=l  i'=ll<  j#!<k  3 3 3 3 ’ 

< * i’1')  (V{  - EY{)T(C£  j -C£J , j - EY  j) 


2E  E ( EE  )tr[(XljxJj<rJ  + In.4)(Ci(-Cit  _i) 
i = l i'=l  1 < j <k  J 1J  j J1  J',  i 

<**>  (slt?Vi+i„,4)(c£j-Cjjj_i)] 


S2(  1<T<k(^i^)”;+4)2ifn?1^<?^<k)tr[(6j'-6j«.-l) 

('5“.!'“)  (Slj-Sy,.,.)] 


k k 


= 2<  1<"f‘<k(x1i5J'i)4+4)2E  E ( EE  JffsJjXjlxJx,,)-1 

1^1^k  1 = 1 i'=l  1 < j # £ < k \ J JV 

— (^o(“i)^o(~i) )_)^o£~0£(( -o^o)_1  ~ (^o  (~i/)^o(“^#) )_)^ 

~2(  i<miX<k(-ii'^)^  + £r2e)2i?1  Jitr^o(-i)X0(-i)((xJxo)-1 

-(xT(-i)Xo(-i))-)xT(-i0V-i0((xJxo)-1-(xT(-i')Xo(-i'))- 


2(  1 <m!X<k(^i* )^  + <re)2{|i  EM^X^^ 


Tv  \-1yT  y /yTy 


k k 


+ 2£  .^^[^^(xTxj-^xTc-iOXoC-iOCCxTc-iOXoC-iOr-ip^ 


k k 


+ .£  E tr[(I  -X01(-i)X0(-i)(xT(-i)X0(-i))-)(I p -xT(-i/)X0(-i') 
1“1  1—1  U 


(xjc-ioxoc-ior)]} 




= 2(  max  (X  ixJ'i)cr2  + £r2)2{tr[T  ] +2^  tr[ (xj( -i  )X0( -i  ) ) 

1 < 1 < K j.  ^ i;=l 

(Xj(i)X0(-i))--I  ]+£  E tr(I  -xj(-i)x0(-i)(xj(-i) 

F°  i=l  i'=l  F° 

X0(-i)r)(Ip0-xJ(-i,)Xo(-i,)(x3'(-i/)X0(-i')r)]} 

<2(  max  (X  + )2{p  + ^ £ tr[Ip  ]} 

1<1<k  iel0i#€l0 

<2(  max  (XixJi)<7?  + (4)2{p0  + p3}. 

1 < l < k 11 

Again,  by  assumption  A2(i),  and  the  fact  that  the  expected  value  is 
zero,  the  term 

All  (1<^<k,(Yj-EYj)T(5J«-Cje>-iKt-^) 

(#M0 

converges  in  probability  to  zero. 


iii)  Now  consider 


= ie^Jfi(-j“EYJ)^ r(cjj>.i-cjj,(vj-EVj) 


+iAj?i(Yj"EYj)T(5Jj--i“6jj)(Yj"EYJ) 


For  each  i * I0,  is  nonnegative  definite  according  to 




Henderson  and  Searle  (1981)  because  the  condition  col(X(J^X0|) 
Ccol(X^  _•  XQ  _•)  is  satisfied.  Therefore  the  following  is  a 
nonnegative  function. 


E (Yj-EYj)'(Cjj,.i-CjJ)(VJ-EYJ) 
1 £ 10  Jr1 


Hence  E< 


{,?  i5i«j-EVj)T(CjJ,.i-CJj)(Yj-EYj)} 
1 £ A0  Jr1 


-(  i<",f<kchL(XiiXj'i)^  + 4).i:i  X.trCCjj.-i-Sjj] 


0Ioj# 


= ( 


l<mfx<kchL(XlixJ'i)^  + 4).  £ tr[(xJ>_iX0>_i)((xJ>_iX0>_.)-1 

— - 1 £ 1q 


-(5o?o)"')] 


= < ! mr<kchL(X,iXTi)„f  + 4)  p tr[lpo-lpo  + x;ixoi(x3jt0)-1] 

— — 1 £ 1q 


fTv  \-l- 


< ( i ^kchL(XlixT.)^  + a2)^tr[xT  ^.(xTXq)-1] 

= ( i<m!x<kchL(^i^i)<r?+^)trCJPo] 

= ( 1<mfX<kchL^ii^i)<T?  + 4)Po- 


Thus 

A . £ .E.(YrEY.)T(C..  _i-Cii)(Y.-EY.) 

converges  to  zero  in  probability. 




For  £ E.(Yj-EYj)T(Cjjj_i-Cjj)(Yj-EYj)  a different 

i £ ^0  J ^ i 

argument  is  necessary. 


iFi  E{Ci5i(Yj-EYj) ^jj,-i-Sjj)(vEtj)]2> 

1 t Ao  Jf  1 


= ( EE  )E(Yi-EY.)T(C..  .i-Cii)(Y.-EY.) 
i<j^e<k  J J JJ’  jj  J J 


(yt-E)Tl),(Sll>.i-Stt)«|-EYl)} 


I0j# 


+ 24tr(Cjj>_i-CJj)(Cjj(-i)-CJj)] 


+ 44tr(Cjj!_i-CjJ)XlixJ1(Cjj(-i)-CJj)] 


+ 24tr[xT.(cjjj.i-Cjj)Xlix;i(Cjj(-i)-CjJ)Xlj] 


”F  ( ^4  ? e 3<7e)  ^E^dg  + (P4  ? j — 3ffi)  E^g 


+ (1<?^  <k)tr[(gjj»-i_Cjj)('li^i<T?  + <T2eJnj)] 


tr[(C££>_i-C££)(Xl£X^+In^|)]} 

(where $d_\ell$ is the $\ell$th diagonal element of $C_{jj,-i} - C_{jj}$ and $\delta_\ell$ is the $\ell$th diagonal element of $X_{1j}^T(C_{jj,-i} - C_{jj})X_{1j}$.)

lFi0{j5i[2(  i<T<kchds1i*u)d+4)Jtr[(6jj,.i-sjj)(gJj>.i-5Jj)] 




+ 04,e-3<4)  E d£  +(/i4,i-3(Ti)  E*? 

+ E E tr[(C..  .i-Cii)((XlixJ’i)<r?  + In  4)] 
j ^ 1£  ^ i JJ’  JJ  J 

trH(^,-i-C££)  (Xl£xT^  + in^|)]} 


(Note that if the coefficient of kurtosis is not positive for either of the random components then that term can be ignored.)


< E {E  [2(  i<m|x<kchL(XlixT)^  + 4)2tr[(Cjj5_i-Cjj)(Cjj>_i-Cjj)] 

1 & A0J  T-  1 — — 

Dj 

+ 0(1)(  max  chL(X  jXj'.))2  £ d£ 

1 < l < k u £=i  c 


+ (E.tr[(Cjjj_i-Cjj)](  1<^x<kchL(XlixT.)4  + 4))2} 


“iFl  {i5iC2(  1 <i  < kChL(^ii^i)<ri + <Te)2trC(Cjj?_i -Cjj)(CJj_i -Cjj 


€i0j# 


i)] 


+ 0(1)(  1<mfx<kchL(XlixJ'.))2  X^d2 


+ ( 1 < iX<kchL^ii^i)°'i  + <Te)2(2Po)2} 


iFlo  j^i^  l<mfX<kChL^ii^i)^  + ^)2trt(ejj,_i-Cjj)CJj5_i-Cjj 


)] 


+ °(1)(  ls«f>‘skc'>L(S,i5u))2i£  j"^il<"f|n.dfltr(Sjj,.i-CjJ]|} 

+ 4(  i<iX<kchL^ii^i)(Ti  + tTe)2po 




= 2(  l<i<kChLaii^i)'’’  + ^)i1fi  j5itr[(Cjj,-i'Cjj)(5jj.-i_5jj 


o J # 


+ 0(1)(  1 <Pfx<kchL(XlixJ'i)<r?  + 4)2Po 


)] 


<2(  i kchL(XlixJ'i)<r2  + (T|)2  + 0(l)(  i + 


,-T  \ _2  , _2  \2_3 


which  is  0(’4k)  by  assumption  A2(i).  Thus,  finally  putting  everything 
together 


converges  to  zero  in  probability. 


Theorem  2.4.2  Under  model  (2.1.1)  with  assumptions  A1  and  A2 


^ y2/  ( 1Jack 


T • 


-E 


ljack  d 


T2 > jack,/  1^2, jack/ 


N2(Q,  I2) 


Proof . Combining  Theorem  2.2.1  with  Lemma  2.3.4  and  Lemma  2.3.5 
gives  the  result  by  applying  Slutsky’s  Theorem. 


Theorem  2.4.3  Assume  that  g is  a real  valued  function  of  two 
variables,  g:R2  >R,  with  continuous  second  partial  derivatives  which 


I 2 > 

l +n1*<Te' 


are  continuous  in  some  neighborhood  of  I **  I.  Again,  under 

\ ) 

model  (2.1.1)  with  assumptions  A1  and  A2 




2 2 


^{g(Tl,jack’  T2, jack)  S(<Tl  + n^e > °e))  N(0’  £ ?1SigjVij) 

% ^ J 


where  g l = ('l +Bj?4’4)  > §2  = O?  + n^4>4)  * and  vij  is  the  C1 » J) 

element  of  V. 


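Proof. This is a consequence of Theorem 2.3.2 and the delta method. A Taylor expansion of $g(T_{1,\mathrm{jack}},\ T_{2,\mathrm{jack}})$ in a neighborhood of $\left(\sigma_1^2 + \tfrac{1}{n_{1*}}\sigma_e^2,\ \sigma_e^2\right)^T$ gives

\[
\begin{aligned}
g(T_{1,\mathrm{jack}},\ T_{2,\mathrm{jack}}) ={}& g\left(\sigma_1^2 + \tfrac{1}{n_{1*}}\sigma_e^2,\ \sigma_e^2\right) \\
&+ \frac{\partial g}{\partial x}\left(\sigma_1^2 + \tfrac{1}{n_{1*}}\sigma_e^2,\ \sigma_e^2\right)\left(T_{1,\mathrm{jack}} - \left(\sigma_1^2 + \tfrac{1}{n_{1*}}\sigma_e^2\right)\right) \\
&+ \frac{\partial g}{\partial y}\left(\sigma_1^2 + \tfrac{1}{n_{1*}}\sigma_e^2,\ \sigma_e^2\right)\left(T_{2,\mathrm{jack}} - \sigma_e^2\right) \\
&+ \frac{1}{2}\frac{\partial^2 g}{\partial x^2}(\epsilon_1, \epsilon_2)\left(T_{1,\mathrm{jack}} - \left(\sigma_1^2 + \tfrac{1}{n_{1*}}\sigma_e^2\right)\right)^2 \\
&+ \frac{1}{2}\frac{\partial^2 g}{\partial y^2}(\epsilon_1, \epsilon_2)\left(T_{2,\mathrm{jack}} - \sigma_e^2\right)^2 \\
&+ \frac{\partial^2 g}{\partial x \partial y}(\epsilon_1, \epsilon_2)\left(T_{1,\mathrm{jack}} - \left(\sigma_1^2 + \tfrac{1}{n_{1*}}\sigma_e^2\right)\right)\left(T_{2,\mathrm{jack}} - \sigma_e^2\right)
\end{aligned}
\]

where $(\epsilon_1, \epsilon_2)^T$ is on a line segment between $(T_{1,\mathrm{jack}},\ T_{2,\mathrm{jack}})^T$ and $\left(\sigma_1^2 + \tfrac{1}{n_{1*}}\sigma_e^2,\ \sigma_e^2\right)^T$.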




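Thus

\[
\begin{aligned}
\sqrt{k}\Big[g(T_{1,\mathrm{jack}},\ T_{2,\mathrm{jack}}) &- g\left(\sigma_1^2 + \tfrac{1}{n_{1*}}\sigma_e^2,\ \sigma_e^2\right)\Big] \\
={}& \sqrt{k}\,\frac{\partial g}{\partial x}\left(\sigma_1^2 + \tfrac{1}{n_{1*}}\sigma_e^2,\ \sigma_e^2\right)\left(T_{1,\mathrm{jack}} - \left(\sigma_1^2 + \tfrac{1}{n_{1*}}\sigma_e^2\right)\right) \\
&+ \sqrt{k}\,\frac{\partial g}{\partial y}\left(\sigma_1^2 + \tfrac{1}{n_{1*}}\sigma_e^2,\ \sigma_e^2\right)\left(T_{2,\mathrm{jack}} - \sigma_e^2\right) \\
&+ \sqrt{k}\,\Big\{\frac{1}{2}\frac{\partial^2 g}{\partial x^2}(\epsilon_1, \epsilon_2)\left(T_{1,\mathrm{jack}} - \left(\sigma_1^2 + \tfrac{1}{n_{1*}}\sigma_e^2\right)\right)^2 \\
&\qquad + \frac{\partial^2 g}{\partial y \partial x}(\epsilon_1, \epsilon_2)\left(T_{1,\mathrm{jack}} - \left(\sigma_1^2 + \tfrac{1}{n_{1*}}\sigma_e^2\right)\right)\left(T_{2,\mathrm{jack}} - \sigma_e^2\right) \\
&\qquad + \frac{1}{2}\frac{\partial^2 g}{\partial y^2}(\epsilon_1, \epsilon_2)\left(T_{2,\mathrm{jack}} - \sigma_e^2\right)^2\Big\}.
\end{aligned}
\]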


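The claim will follow from Theorem 2.3.2 if this last term can be shown to converge to zero in probability.

Since g has continuous second partials in a neighborhood of $\left(\sigma_1^2 + \tfrac{1}{n_{1*}}\sigma_e^2,\ \sigma_e^2\right)^T$, which are therefore bounded there, it suffices to show that

i) $\sqrt{k}\left(T_{1,\mathrm{jack}} - \left(\sigma_1^2 + \tfrac{1}{n_{1*}}\sigma_e^2\right)\right)^2 \xrightarrow{P} 0$

ii) $\sqrt{k}\left(T_{2,\mathrm{jack}} - \sigma_e^2\right)^2 \xrightarrow{P} 0$

whence we get the cross product term

iii) $\sqrt{k}\left(T_{1,\mathrm{jack}} - \left(\sigma_1^2 + \tfrac{1}{n_{1*}}\sigma_e^2\right)\right)\left(T_{2,\mathrm{jack}} - \sigma_e^2\right) \xrightarrow{P} 0$

by Cauchy Schwarz.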

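Using Theorem 2.3.2 observe that

\[
\sqrt{k}\left(T_{1,\mathrm{jack}} - \left(\sigma_1^2 + \tfrac{1}{n_{1*}}\sigma_e^2\right)\right) \xrightarrow{d} N(0,\ v_{11})
\]

and

\[
\sqrt{k}\left(T_{2,\mathrm{jack}} - \sigma_e^2\right) \xrightarrow{d} N(0,\ v_{22}).
\]

Thus $k\left(T_{1,\mathrm{jack}} - \left(\sigma_1^2 + \tfrac{1}{n_{1*}}\sigma_e^2\right)\right)^2$ converges in distribution to $v_{11}\chi^2_1$ and $k\left(T_{2,\mathrm{jack}} - \sigma_e^2\right)^2$ converges in distribution to $v_{22}\chi^2_1$.

Apply Slutsky's theorem to

\[
\sqrt{k}\left(T_{1,\mathrm{jack}} - \left(\sigma_1^2 + \tfrac{1}{n_{1*}}\sigma_e^2\right)\right)^2 = \frac{1}{\sqrt{k}}\, k\left(T_{1,\mathrm{jack}} - \left(\sigma_1^2 + \tfrac{1}{n_{1*}}\sigma_e^2\right)\right)^2
\]

and

\[
\sqrt{k}\left(T_{2,\mathrm{jack}} - \sigma_e^2\right)^2 = \frac{1}{\sqrt{k}}\, k\left(T_{2,\mathrm{jack}} - \sigma_e^2\right)^2
\]

which shows both terms converge to 0 in probability.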

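Therefore

\[
\begin{aligned}
\sqrt{k}\Big[g(T_{1,\mathrm{jack}},\ T_{2,\mathrm{jack}}) &- g\left(\sigma_1^2 + \tfrac{1}{n_{1*}}\sigma_e^2,\ \sigma_e^2\right)\Big] \\
={}& \sqrt{k}\,\frac{\partial g}{\partial x}\left(\sigma_1^2 + \tfrac{1}{n_{1*}}\sigma_e^2,\ \sigma_e^2\right)\left(T_{1,\mathrm{jack}} - \left(\sigma_1^2 + \tfrac{1}{n_{1*}}\sigma_e^2\right)\right) \\
&+ \sqrt{k}\,\frac{\partial g}{\partial y}\left(\sigma_1^2 + \tfrac{1}{n_{1*}}\sigma_e^2,\ \sigma_e^2\right)\left(T_{2,\mathrm{jack}} - \sigma_e^2\right) + o_p(1).
\end{aligned}
\]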



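According to Theorem 2.3.2

\[
\sqrt{k}\left(\frac{\partial g}{\partial x}\left(\sigma_1^2 + \tfrac{1}{n_{1*}}\sigma_e^2,\ \sigma_e^2\right),\ \frac{\partial g}{\partial y}\left(\sigma_1^2 + \tfrac{1}{n_{1*}}\sigma_e^2,\ \sigma_e^2\right)\right)\left[\begin{pmatrix} T_{1,\mathrm{jack}} \\ T_{2,\mathrm{jack}} \end{pmatrix} - \begin{pmatrix} \sigma_1^2 + \tfrac{1}{n_{1*}}\sigma_e^2 \\ \sigma_e^2 \end{pmatrix}\right] \xrightarrow{d} N\big(0,\ (g_1, g_2)\, V\, (g_1, g_2)^T\big)
\]

which proves the theorem.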

It should be pointed out that $g(T_{1,\mathrm{jack}},\ T_{2,\mathrm{jack}})$ is not the jackknife estimator of $g\left(\sigma_1^2 + \tfrac{1}{n_{1*}}\sigma_e^2,\ \sigma_e^2\right)$, which will be denoted by $g_{\mathrm{jack}}(T_1, T_2)$. However, asymptotically they are equivalent. The proof of this depends on the fact that the jackknife estimator of variance converges in probability and will be postponed until after this last step is proven.

While the jackknife does have an important function in improving estimates, the real beauty of the jackknife technique is that it also can be used to provide an estimate of the variance of the estimator of interest. In other words, the jackknife can be used to estimate the variance of the variance estimator. The form of the estimate of variance is given by

\[
(k-1)\sum_{i=1}^{k}\Big(\frac{1}{r_{1,-i}}SSE_{1,-i} - \frac{1}{k}\sum_{j=1}^{k}\frac{1}{r_{1,-j}}SSE_{1,-j}\Big)^2 \tag{2.4.13a}
\]

and

\[
(k-1)\sum_{i=1}^{k}\Big(\frac{1}{r_{2,-i}}SSE_{2,-i} - \frac{1}{k}\sum_{j=1}^{k}\frac{1}{r_{2,-j}}SSE_{2,-j}\Big)^2. \tag{2.4.13b}
\]
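
As a rough illustration of the delete-one-group jackknife used throughout this chapter, the following sketch computes a jackknifed estimate of the form k·θ̂ − ((k−1)/k)·Σθ̂₋ᵢ together with the classical jackknife variance estimate. It is a minimal sketch under stated assumptions: the function names `jackknife` and `pooled_mean`, the toy data, and the choice of the pooled mean as the estimator are all hypothetical and are not taken from the dissertation. The variance returned here carries the usual (k−1)/k factor; multiplying it by k gives a quantity on the scale of the asymptotic variance of a √k-normalized estimator.

```python
from statistics import mean


def jackknife(groups, estimator):
    """Delete-one-group jackknife over a list of independent groups.

    `estimator` maps a list of groups to a number (hypothetical interface).
    """
    k = len(groups)
    full = estimator(groups)
    # Leave-one-out estimates, one for each deleted group.
    loo = [estimator(groups[:i] + groups[i + 1:]) for i in range(k)]
    loo_mean = mean(loo)
    # Jackknifed estimate: k * theta - (k - 1)/k * sum of leave-one-out values.
    theta_jack = k * full - (k - 1) * loo_mean
    # Classical jackknife estimate of the variance of the estimator.
    var_jack = (k - 1) / k * sum((t - loo_mean) ** 2 for t in loo)
    return theta_jack, var_jack


def pooled_mean(groups):
    """Toy estimator: mean of all observations across groups."""
    values = [y for g in groups for y in g]
    return sum(values) / len(values)


# Four independent groups (cells) of two observations each; illustrative data.
groups = [[1.0, 2.0], [3.0, 5.0], [4.0, 6.0], [8.0, 7.0]]
theta_jack, var_jack = jackknife(groups, pooled_mean)
```

For equal group sizes and a linear estimator such as the pooled mean, the jackknifed estimate coincides with the original estimate, and the variance estimate reduces to the sample variance of the group means divided by k.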


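Lemma 2.4.6 Under model (2.1.1) with assumptions A1 and A2

i) $k\Big(\frac{1}{r_1}SSE_1 - \frac{1}{k}\sum_{j=1}^{k}\frac{1}{r_{1,-j}}SSE_{1,-j}\Big)^2 \xrightarrow{P} 0$

and

ii) $k\Big(\frac{1}{r_2}SSE_2 - \frac{1}{k}\sum_{j=1}^{k}\frac{1}{r_{2,-j}}SSE_{2,-j}\Big)^2 \xrightarrow{P} 0$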


Proof.  i)  r-1— SSE1  iY 

Lri  i kjTiri,-J  1?'J 

= ft)2[SSE-  -Ij4SSE. . - J + Ejl ( 1 - 'TjTj)SSEi • - i ]! 

< K.kT-^t^E  1,  jt|E  (i-AlssE^jf 

K j = 1 ? J Kj  = 1 1,"J  ’ J 


, jack  - T.  > + tSSEi  + it  < 1 - r^)SSE, , - j ]! 

J-l  ’ J 


= [”..(T1,jack-T.)+T(kT>-¥21T.,-j-T') 


+ TT>+i£<1+E-rfL^SSEi  - j]’ 

K Kj=l  K 1»~J  ’ J 


<4n1,(^)2(Tljjack-T1)H^T;  + r[^£(^-rfi1)SSE1>j] 

J — 1 ’ ^ 


2 i 4ni*T2  i _rlf>  /k+1  rl 


By  Lemma  2.3.5  (Tj  jacj.  — Tj)2  = Op(i)  and  by  Theorem  2.2.1  ^4jT2  is 

°P(k^) 


1 t2  ; . 


Thus  consider  only 


i k 

rtE( 

kj=i 


k+1 

k 


)SSE 

l.-J 


2 


<4[l|i(M-^)(SSE1.j-SSE0]2  + 4(^[|i(^-I1^)] 


^[|/i-:^M)(sSE,,.j-SSE,)]^Op(l)[gi  -’-4  •] 


k (k+l)r1?_j-kr! 


According  to  Lemma  2.3.5  i),  ii),  and  iii)  the  first  term  is 


^[J:/i-I1n7r3zi)«j-EYj)T(inj-cJj)(Yj-EYJ)]-+o(iiJ)=o(i) 


by  assumption  A2(ii). 


k (k+1 )r  --krx  k 

[E ] — TT, 1 rj(E((k+l)r  ,-kr, 

k ( 1 “ax  , r,  _i)  j=l  ’ J 

1 < l < k 1 J 


j=i  kr.,-j 


= 0(l)i((k+l)Etr[xT  -P  )X  j]-kV,)! 

K i=l  ’ -J  J -0,-1  ’ J 


0(1) 


((k+l)(k-l)tr[xJ'xi]-(k+l)X;  trCxJ1  _-Px  X _•] 


- k2tr[xjxx]  + k2tr[xjpx  Jj])5 


— 0 ( 1 ) /•  4.^rvTv  ui.2t.rYT 


^^■(“trCXjXj]  + k2tr[Xj  Px  Xx]  — (k+1)  tr[X*  -Px  X 
k oo  j=1  ’ J -o,-i 


Note 


that  (k+1)  £ tr[X^  -Px  X.  .]  - k2tr[X^Px  XJ 
j_l  1 ’ J -o , - i 1 ’ J -o 


= (Ul)|itr[xJiJ.0P!!oj_oX1>J_0]-kHr[xTp5oX1] 
= (k+1  ).EtP[X]',  j.0(px’ J o - Px0>t , j-„] 




Thus 

ii) 


+ (k+lJ^trCXj , j_oPX0^i,  j-0]  -k2tr[^PX0?i] 

(k+1)|itrC^ij_0(pxo?i._o-pXo)^,iro] 

+ ( k+! ) ( tr  [ xj,  j -oPX0j_0~i , j - 0]  - tr  C *1 , j -oPX0-i  3 ) 


+ (k+l)£  tr[Xx  • 0Py  Xa] -k2tr[X^Py  Xj] 
* J u 

(k+l)|itr[xTi._0(P5oi^o-P5o)X1,i._0] 

+ (k+l)Etr[xT  j_0Px0(S1,j_0-X1)] 

+ (k+l)(k-l)tr[xJ'Px  Xj]  — k2tr[X^Px  Xx] 


Xo- 


X0- 


(k+l)Etr[[xT  j ._0(PX  . -Pv  )Xj  i._0] 

j=l  1,:LJ  0 -oi.-o  0 


+ (k+lJtrKfp^Xj]  - trtxfp^Xj] 


= (k+l)|;  trtxT  (P X -Px0)S,,i  ,-cl+ktrCxTp^X,]. 

J-l  J j J 


the  last  term  is  0(r:)  which  proves  part  (i) 


= (i)^(SSE2-l|i^i-SSE2i.j)^ 




k \2r„  n2^(k  1 ) 


J-1  J-1  ’ J 

< [n2*(T2,  jack_T2)  +n2*T2  + 5;.I)  (!  -r  \ -)SSE2,- 

j_i  ’ J 

= ["»(T2Jack-T2)+fe(T2Jack-T2)+ii:(^I-r^rj)SSE2j_j]i 

J_  I ’ J 


,(n2*) 


<4i^*(T,>jKk-T,)]*(A)*  + 4[i£  (*-*7^)888.,^]*. 

J_1  ? J 


The first term is Op(^) by Lemma 2.3.4.


N°“  CijS  (l^T  - > SSE2 . - j T 


r \ -\2 


)]' 


k kr. 


r.  - r ccp  k Kr0  • — (k-l)r2 

(kA)r  ■ ■:  -3a 

J_1  ’ J J_1  v ' 2 , — J 

(2.4.14) 


According  to  Lemma  2.3.4  (i),  (ii),  and  (iii),  the  first  term  on  the 
right  hand  side  of  (2.4.14)  is 


+ 0(1) 
= 0(1) 


again,  by  assumption  A2(ii).  Next  consider  the  second  term  on  the 
right  hand  side  of  (2.4.14). 


>1  (k-Dr^.j  J 

k(n. -n.) -ktrCPy  .)  - (k-l)n . + (k-1  )tr(Pu  ) 

J ~2 , - j ~2  ■ 


= [£ 
j=i 


(k-l)2(n>  — rank(Xj)) 


j[k(k-l)n. — k(k-l )n . + (k-^ktr^  ) 

-kEtrP  .)]2 

j=l  2»-J 


(k-l)2(n.  — rank(Xx)) 


— I[k(k-l)tr(Pu  ) -k^  tr(Pu  .)] 


2 .1=1  2.-J 


Ck3i)27 1./T  „a[(k-i)tr(iv)-kE  tr(u 2 j)]2 

K 1 (n^  — rank(Xj))  =2  **  J 


= 0(A)[  £ (trLPy  ]-tr[PLI  J)]2 
k iel  2 2>-i 


<0(^)[£o(Pi  + Po)]2 

= o(i). 


This  proves  part  (ii). 


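Theorem 2.4.4: Under model (2.1.1) with assumptions A1 and A2

i) $(k-1)\sum_{j=1}^{k}\Big(\frac{1}{r_{1,-j}}SSE_{1,-j} - \frac{1}{r_1}SSE_1\Big)^2 \xrightarrow{P} \frac{v_{11}}{(n_{1*})^2}$

and

ii) $(k-1)\sum_{j=1}^{k}\Big(\frac{1}{r_{2,-j}}SSE_{2,-j} - \frac{1}{r_2}SSE_2\Big)^2 \xrightarrow{P} \frac{v_{22}}{(n_{2*})^2}$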


Proof . i)  First  apply  representations  (2.3.3)  and  (2.4.5)  to  the 
term 


rvTISSEi , - i " ^SSE1  - rjCsi  ( Yi ) - : Egl  ( Yi ) ] 


-(r^-i)jEi(Yj-EYj)Uin.-Cjj)(Yj-EYj) 

^jEi(Yj-EYj)T(Cjj-cjj?-i)(Yj-EYj) 


+ ^E.(Yj-EYj)Tcji(Yi-EYi) 


-(rr-^-rr)  EE  (Yi -ey.^c.^-ey^) 

1,-1  l<j  #£<k  J J e £ 

(#i) 

-rT^T  EE  (YrEYi)T(Cj£  _i-Cjfi)(Y|-EY£) 
(#i) 

Now  use  the  parts  of  Lemma  2.3.5  to  apply  to 


(k-l)E  ^(r~  rSSEj  - E SSE,  - 1 [gjUj) -Eg^Yj)]}2 

1=1  1 5 “ 1 ’ 1 1 

= (k-i).i:-- "2r21,-l)  E(E  (Yj-EYj)T(In.-Cjj)(Yj-EYj))2 

1=1  ri1i,-i  j/i  j j j jj  j j 

+ (k-l)E-1  2E(E  (Yi-EY:)T(In.-Ci  i i ) ( Y i — E Y ; ) ) " 
i=l(ri,_i)  j # i J J J J,J’  1 J J 




+ (k-l)^  £E(£  (Y i — EY*)TCji(Yj  — EYj 
r!  i=l  j/i  J J J J J 


i))5 


+ (k-1 ) £ 2 21’  1 E(  £ £ (Y-.  - EY  • )TC  .c)  (Ye  - EYe))2 

i=l  riri  -i  l<j?M<k  J J 

( ^ i) 

+ (k-l)£.  1 2E(  £ £ (Yj-EY.^C^-Cj*  i)(Y£-EYg))2 

i = l(r!  _i)  l<j^C<k  J J J J ’ 

(#i) 

+ cross  product  terms. 


(ri  — ri  _i)2  o( p?) 

Note  from  the  proof  of  the  previous  lemma  — 5- - — -3—  = — r-j — • This 

rl(rl,-i)  k 

together  with  assumption  A2(ii)  imply  the  first  term  in  the  previous 
expression  is  O(^).  According  to  part  (ii)  of  lemma  2.3.5  the 
second  term  is  o(^).  According  to  part  (i)  of  lemma  2.3.5  the  third 
term  is  o(^) . Combining  to  above  note  with  the  aforementioned  part 
(i)  of  Lemma  2.3.5  the  fourth  term  is  also  o(j^).  Finally,  part 
(iii)  of  Lemma  2.3.5  implies  the  fifth  term  is  O(^).  Apply  Cauchy 
Schwarz  to  the  cross  product  terms  to  get  that  the  entire  quantity 
is  0(i). 

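Next we consider the sum

\[
(k-1)\sum_{i=1}^{k}\Big\{\frac{1}{r_1}\big[g_1(Y_i) - Eg_1(Y_i)\big]\Big\}^2 = \frac{k-1}{r_1^2}\sum_{i=1}^{k}\{g_1(Y_i) - Eg_1(Y_i)\}^2 = \frac{k(k-1)}{r_1^2}\,\frac{1}{k}\sum_{i=1}^{k}\{g_1(Y_i) - Eg_1(Y_i)\}^2.
\]

Now

\[
\begin{aligned}
E\,\frac{1}{k}\sum_{i=1}^{k}\{g_1(Y_i) - Eg_1(Y_i)\}^2
&= \frac{1}{k}\sum_{i=1}^{k}\mathrm{Var}(g_1(Y_i)) \\
&= \frac{1}{k}\mathrm{Var}\Big(\sum_{i=1}^{k} g_1(Y_i)\Big) \\
&= \frac{1}{k}\mathrm{Var}\Big(SSE_1 + \sum_{i=1}^{k} g_1(Y_i) - SSE_1\Big) \\
&= \frac{1}{k}\Big\{\mathrm{Var}(SSE_1) + 2\,\mathrm{Cov}\Big(SSE_1;\ \sum_{i=1}^{k} g_1(Y_i) - SSE_1\Big) + \mathrm{Var}\Big(\sum_{i=1}^{k} g_1(Y_i) - SSE_1\Big)\Big\}.
\end{aligned}
\]

By Lemma 2.2.2

\[
= \frac{1}{k}\mathrm{Var}(SSE_1) + o(1) = v_{11} + o(1).
\]

By Theorem B(i) p. 275 of Loeve (1963), $\frac{1}{k}\sum_{i=1}^{k}(g_1(Y_i) - Eg_1(Y_i))^2$ converges in probability to $v_{11}$. Since $\frac{r_1}{k} \to n_{1*}$, then

\[
(k-1)\sum_{i=1}^{k}\Big\{\frac{1}{r_{1,-i}}SSE_{1,-i} - \frac{1}{r_1}SSE_1\Big\}^2 \xrightarrow{P} \frac{v_{11}}{(n_{1*})^2}.
\]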


For  part  (ii)  apply  representations  (2.3.4)  and  (2.4.6)  to  the  term 


-(r^-n]5i«j-^j)Ta»J-Pxlj)5„jCrjJj(I„j-Px1.)(Yj-f:Yj) 




+ (i 


T)jz:i(Yj-EYJ)Ta„J-pxlj)x0j[Gr. -cr^i3xJJ«n.-pSii) 


~ij 


(Yj-EYj) 


E 


(Yj-BlTj^do. 


)(Yj-EYj) 


-( 


EE  (Y- -EY -)T 
1 < j 7^  £ < k J J 

(#i) 


) 


(Y€9-ey£) 


EE  (Y.-EY.^U 

1 < J #«<k  J J 

(#i) 


)xoj[cr.-C7.]x0T£a 


) 


(Y£,-ey£) 


As  before  apply  Lemma  2.3.4  to  the  sum 


(k-l)  EEfrr-rSSB  i-iSSEj-J-Cg.fYp-Eg.fV;)]}1 

1=1  ’ 1 z 


(k-i)£  r (Yj-Ey.)‘(!  p ) ( v . - Ey j 

1 = 1 2?~1  i j ^ i J J J -1J  J J 


)f 


(Yj-EYj)]2 

+ (k-l)  Er=t  E (Yj-EYj)T(In  - Px  .)xt(In.-Px  .) 
1 = 1 2,-l  J ± 1 J J J -1J  J J -11 




(Yj-EVj)]* 


+ (k-l)£4[  .Ettj-EljPUn  -Px  WojCr.xJiU,,  -px  ) 

l = lr2  J ^ 1 J J J -1J  J 1-11 


(Yi-EYj)]2 


+ (k-i)E  (rr^-r-yi  EE  (y.-ey-)^!  -px  .)xoj 

i = l 2,-1  2 1 < j ^:  £ < k J J J -1J  0J 

(#i) 


r-  xT 
W . he 


+ (k-l)Err-r[  EE  (Yj-EY.)T(I  -px  .)Xoj[CT.-C7.]x; 
i=i  2,-1  i< j^e<k  J J J -ij  J 

(*i) 

(!«e-%e)(Ye-EYe)]2 


T 
o e 


+ cross  product  terms. 

Again  for  the  previous  lemma  (jr-^ 

2 9 

assumption  A2(ii)  this  implies  the  first  two  terms  are  O(j^). 
According  to  part  (iii)  of  Lemma  2.3.4  the  third  term  is  °(j“)- 
According  to  part  (i)  of  Lemma  2.3.4  the  fourth  term  is  °(j“)- 
Combining  the  above  note  with  part  (i)  of  Lemma  2.3.4  shows  the 
penultimate  term  is  o(^-).  Finally  the  last  term  is  o(^)  by  Lemma 
2.3.4  part  (ii).  Of  course  all  cross  product  terms  are  o(j^)  by 
Cauchy  Schwarz. 


0(P0) 


and  together  with 


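Next consider the sum

\[
(k-1)\sum_{i=1}^{k}\Big\{\frac{1}{r_2}\big[g_2(Y_i) - Eg_2(Y_i)\big]\Big\}^2 = \frac{k-1}{r_2^2}\sum_{i=1}^{k}\{g_2(Y_i) - Eg_2(Y_i)\}^2 = \frac{k(k-1)}{r_2^2}\,\frac{1}{k}\sum_{i=1}^{k}\{g_2(Y_i) - Eg_2(Y_i)\}^2.
\]

Now

\[
\begin{aligned}
E\,\frac{1}{k}\sum_{i=1}^{k}\{g_2(Y_i) - Eg_2(Y_i)\}^2
&= \frac{1}{k}\sum_{i=1}^{k}\mathrm{Var}(g_2(Y_i)) \\
&= \frac{1}{k}\mathrm{Var}\Big(\sum_{i=1}^{k} g_2(Y_i)\Big) \\
&= \frac{1}{k}\mathrm{Var}\Big(SSE_2 + \sum_{i=1}^{k} g_2(Y_i) - SSE_2\Big) \\
&= \frac{1}{k}\mathrm{Var}(SSE_2) + \frac{2}{k}\mathrm{Cov}\Big(SSE_2;\ \sum_{i=1}^{k} g_2(Y_i) - SSE_2\Big) + \frac{1}{k}\mathrm{Var}\Big(SSE_2 - \sum_{i=1}^{k} g_2(Y_i)\Big) \\
&= v_{22} + o(1).
\end{aligned}
\]

By Theorem B(i) p. 275 of Loeve (1963), $\frac{1}{k}\sum_{i=1}^{k}(g_2(Y_i) - Eg_2(Y_i))^2$ converges in probability to $v_{22}$. Since $\frac{r_2}{k} \to n_{2*}$, then

\[
(k-1)\sum_{i=1}^{k}\Big\{\frac{1}{r_{2,-i}}SSE_{2,-i} - \frac{1}{r_2}SSE_2\Big\}^2 \xrightarrow{P} \frac{v_{22}}{(n_{2*})^2}.
\]

Thus the theorem is proven.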

We are now in a position to prove that the jackknife estimator, $g_{\mathrm{jack}}(T_1, T_2)$, is asymptotically equivalent to $g(T_{1,\mathrm{jack}},\ T_{2,\mathrm{jack}})$, but the proof depends on the following technical lemma.

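Lemma 2.4.6 Under model (2.1.1) with assumptions A1, A2, and A3

\[
\text{i)}\quad \max_{1 \le i \le k}\,\big|SSE_1 - SSE_{1,-i}\big| = O_p(\sqrt{k}) \tag{2.4.15a}
\]

\[
\text{ii)}\quad \max_{1 \le i \le k}\,\big|SSE_2 - SSE_{2,-i}\big| = O_p(\sqrt{k}) \tag{2.4.15b}
\]

Proof. i) According to (2.4.12)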

SSE, 


1 - SSEl , . i = (Yi  - EYj  )T(  I„ . - P5o . ) (Yi  - EY . ) 


+ EE  (Yi-EYj)Tcit(Y(-EV,) 


+ EE  (Y1-EYi)T[Ci£-Ci£  .iKYj-EYg) 

1 < j ^ < k J J 1 

(#i) 

+ .E.(Yj-EYj)T[CJji_i-Cjj](Yj-EYJ) 


By part i) of Lemma 2.4.5 we see that


E E 

1< j^£<k 


(Yj  — EYj)TCjg(Yg  — EY^)  = 0(pq) 


By  part  ii)  of  Lemma  2.4.5  we  see  that 


EE  (Yi-EYi)T[Ci£-Ci£  _i](Y£-EY£)  = 0(p5) 

1 < j #£<k  J J J J ’ £ 

(*i) 


By part iii) of Lemma 2.4.5 we see that


E 

j/i 


(Yj_EYj)T[Cjj,_i  -CjjKYj-EYj)  = 0(po) 


.4.15a) 

.4.15b) 


Thus 




SSEj  - SSE1  f _j  = (Y.  - EYj )T( I„.  - P^. ) (Yj  - E*. ) + 0(pg) 
Now  by  Chebychev’s  inequality,  for  any  e > 0 

K , |?|k(Yi-EVi)T(lnj-Pxoi)(Yi -EYi»Wk) 

< .E  ^E«i  - EYi  )T( I„.  - P?o.  KYj  -EYj) 

= 0(1), 

by  Al. 

ii)  According  to  (2.4.10) 


SSE 


2-SSE2i.i  = (Yi  -EYi)T(I„.  -Pj  KYi  -EYj) 


l<SSik(YrEYj),(Inj-pxlj)Sojcr.xi(!„rPxIt)(Yt-EY<) 


+ E.(YrEY{)T(i„/xl()Soj(5r,-iCr.)xT(int-Ps  )(Y4-EY£) 

*■  r 1 


i+<S^<k<^Eb)'(i"j-pxlj)5oj(C7-rcr.)^(inrpx1()(v{-EY{) 

(#i) 

By  part  i)  of  Lemma  2.4.4  we  see  that 


i<F^<k(Yj"EYj)T<Jnj'PS.j)?ojCrj"‘(!"<“Px,f)(v«“EY«,  = 0<p») 


By  part  ii)  of  Lemma  2.4.4  we  see  that 




5/~«_EY£)T(In£pXlg)^0£(C7,-i5r.  )£o£(Jn^  ~ PX, ,,)(¥£  ~EY£)  = 0(pq) 


«#  i 


£ -i£ 


By  part  iii)  of  lemma  2.4.4  we  see  that 

?E,JYJ-EYpT(In.-Pxi.)Xoj(C-_rCrjxJf(InrP?|i,)(V{-EV,)=0(p§) 


1 < j ^£<k 

(*i) 


Thus 


SSE2-SSE2>_i  = (Yi-EYi)T(In.-Pxii)(Yi-EYi)+0(p5) 


Now  by  Chebychev’s  inequality,  for  any  e > 0 


K 1-k(Yi-EYi)T(Ini-PSii)(Yi-EYi)>£^) 


<.E  ^(Vj -EVptlnj -PSi.  )(Yi -EY;) 


= 0(1), 

by A1. This proves the lemma.

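Theorem 2.4.5 Assume that g is a real valued function of two variables, $g: R^2 \to R$, with second partial derivatives which are continuous in some neighborhood of $\left(\sigma_1^2 + \tfrac{1}{n_{1*}}\sigma_e^2,\ \sigma_e^2\right)^T$. Again, under model (2.1.1) with assumptions A1, A2 and A3

\[
\sqrt{k}\left(g_{\mathrm{jack}}(T_1,\ T_2) - g\left(\sigma_1^2 + \tfrac{1}{n_{1*}}\sigma_e^2,\ \sigma_e^2\right)\right) \xrightarrow{d} N\Big(0,\ \sum_{i=1}^{2}\sum_{j=1}^{2} g_i g_j v_{ij}\Big)
\]

where $g_1 = \frac{\partial g}{\partial x}\left(\sigma_1^2 + \tfrac{1}{n_{1*}}\sigma_e^2,\ \sigma_e^2\right)$, $g_2 = \frac{\partial g}{\partial y}\left(\sigma_1^2 + \tfrac{1}{n_{1*}}\sigma_e^2,\ \sigma_e^2\right)$, and $v_{ij}$ is the (i, j) element of V.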

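Proof. This will follow from Theorem 2.3.3 if it can be shown that $\sqrt{k}\left(g_{\mathrm{jack}}(T_1, T_2) - g(T_{1,\mathrm{jack}},\ T_{2,\mathrm{jack}})\right) \xrightarrow{P} 0$. To do this a Taylor expansion of both functions will be used. In a fashion similar to the proof of Theorem 2.3.3 expand $g(T_{1,\mathrm{jack}},\ T_{2,\mathrm{jack}})$ in a Taylor expansion in a neighborhood of $(T_1, T_2)^T$ which gives

\[
\begin{aligned}
g(T_{1,\mathrm{jack}},\ T_{2,\mathrm{jack}}) ={}& g(T_1, T_2) \\
&+ \frac{\partial g}{\partial x}(T_1, T_2)(T_{1,\mathrm{jack}} - T_1) \\
&+ \frac{\partial g}{\partial y}(T_1, T_2)(T_{2,\mathrm{jack}} - T_2) \\
&+ \frac{1}{2}\frac{\partial^2 g}{\partial x^2}(\epsilon_1, \epsilon_2)(T_{1,\mathrm{jack}} - T_1)^2 \\
&+ \frac{1}{2}\frac{\partial^2 g}{\partial y^2}(\epsilon_1, \epsilon_2)(T_{2,\mathrm{jack}} - T_2)^2 \\
&+ \frac{\partial^2 g}{\partial x \partial y}(\epsilon_1, \epsilon_2)(T_{1,\mathrm{jack}} - T_1)(T_{2,\mathrm{jack}} - T_2)
\end{aligned}
\tag{2.4.16}
\]

where $(\epsilon_1, \epsilon_2)^T$ is on a line segment between $(T_{1,\mathrm{jack}},\ T_{2,\mathrm{jack}})^T$ and $(T_1, T_2)^T$.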


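Next observe that expanding $g(T_{1,-i},\ T_{2,-i})$ in a neighborhood of $(T_1, T_2)^T$, for $i = 1, \ldots, k$, gives

\[
\begin{aligned}
g_{\mathrm{jack}}(T_1, T_2)
={}& k\,g(T_1, T_2) - \frac{k-1}{k}\sum_{i=1}^{k} g(T_{1,-i},\ T_{2,-i}) \\
={}& g(T_1, T_2) - \frac{k-1}{k}\sum_{i=1}^{k}\big[g(T_{1,-i},\ T_{2,-i}) - g(T_1, T_2)\big] \\
={}& g(T_1, T_2) - \frac{k-1}{k}\sum_{i=1}^{k}\Big[\frac{\partial g}{\partial x}(T_1, T_2)(T_{1,-i} - T_1) + \frac{\partial g}{\partial y}(T_1, T_2)(T_{2,-i} - T_2)\Big] \\
&- \frac{k-1}{2k}\sum_{i=1}^{k}\Big[\frac{\partial^2 g}{\partial x^2}(\lambda_i, \mu_i)(T_{1,-i} - T_1)^2 + 2\frac{\partial^2 g}{\partial x \partial y}(\lambda_i, \mu_i)(T_{1,-i} - T_1)(T_{2,-i} - T_2) \\
&\qquad\qquad + \frac{\partial^2 g}{\partial y^2}(\lambda_i, \mu_i)(T_{2,-i} - T_2)^2\Big] \\
={}& g(T_1, T_2) + \frac{\partial g}{\partial x}(T_1, T_2)(T_{1,\mathrm{jack}} - T_1) + \frac{\partial g}{\partial y}(T_1, T_2)(T_{2,\mathrm{jack}} - T_2) \\
&- \frac{k-1}{2k}\sum_{i=1}^{k}\Big[\frac{\partial^2 g}{\partial x^2}(\lambda_i, \mu_i)(T_{1,-i} - T_1)^2 + 2\frac{\partial^2 g}{\partial x \partial y}(\lambda_i, \mu_i)(T_{1,-i} - T_1)(T_{2,-i} - T_2) \\
&\qquad\qquad + \frac{\partial^2 g}{\partial y^2}(\lambda_i, \mu_i)(T_{2,-i} - T_2)^2\Big].
\end{aligned}
\tag{2.4.17}
\]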


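A comparison of (2.4.16) and (2.4.17) yields

\[
\begin{aligned}
g_{\mathrm{jack}}(T_1, T_2) &- g(T_{1,\mathrm{jack}},\ T_{2,\mathrm{jack}}) \\
={}& -\frac{k-1}{2k}\sum_{i=1}^{k}\Big[\frac{\partial^2 g}{\partial x^2}(\lambda_i, \mu_i)(T_{1,-i} - T_1)^2 + 2\frac{\partial^2 g}{\partial x \partial y}(\lambda_i, \mu_i)(T_{1,-i} - T_1)(T_{2,-i} - T_2) \\
&\qquad\qquad + \frac{\partial^2 g}{\partial y^2}(\lambda_i, \mu_i)(T_{2,-i} - T_2)^2\Big] \\
&- \frac{1}{2}\frac{\partial^2 g}{\partial x^2}(\epsilon_1, \epsilon_2)(T_{1,\mathrm{jack}} - T_1)^2 - \frac{1}{2}\frac{\partial^2 g}{\partial y^2}(\epsilon_1, \epsilon_2)(T_{2,\mathrm{jack}} - T_2)^2 \\
&- \frac{\partial^2 g}{\partial x \partial y}(\epsilon_1, \epsilon_2)(T_{1,\mathrm{jack}} - T_1)(T_{2,\mathrm{jack}} - T_2).
\end{aligned}
\tag{2.4.18}
\]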




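Since the second partials of g are continuous and the maximum differences $|T_1 - T_{1,-i}|$ and $|T_2 - T_{2,-i}|$ converge to 0, the second partials of g evaluated at $(\lambda_i, \mu_i)$ are simultaneously bounded. Thus according to Theorem 2.4.5

\[
\sqrt{k}\,\frac{k-1}{2k}\sum_{i=1}^{k}\Big[\frac{\partial^2 g}{\partial x^2}(\lambda_i, \mu_i)(T_{1,-i} - T_1)^2 + 2\frac{\partial^2 g}{\partial x \partial y}(\lambda_i, \mu_i)(T_{1,-i} - T_1)(T_{2,-i} - T_2) + \frac{\partial^2 g}{\partial y^2}(\lambda_i, \mu_i)(T_{2,-i} - T_2)^2\Big] \xrightarrow{P} 0. \tag{2.4.19}
\]

Again, applying the continuity of the second partials of g, together with the results of Lemmas 2.4.4 and 2.4.5 shows that

\[
\sqrt{k}\,\Big\{\frac{1}{2}\frac{\partial^2 g}{\partial x^2}(\epsilon_1, \epsilon_2)(T_{1,\mathrm{jack}} - T_1)^2 + \frac{1}{2}\frac{\partial^2 g}{\partial y^2}(\epsilon_1, \epsilon_2)(T_{2,\mathrm{jack}} - T_2)^2 + \frac{\partial^2 g}{\partial x \partial y}(\epsilon_1, \epsilon_2)(T_{1,\mathrm{jack}} - T_1)(T_{2,\mathrm{jack}} - T_2)\Big\} \xrightarrow{P} 0. \tag{2.4.20}
\]

Finally, combining (2.4.19) and (2.4.20) implies that

\[
\sqrt{k}\,\big[g_{\mathrm{jack}}(T_1, T_2) - g(T_{1,\mathrm{jack}},\ T_{2,\mathrm{jack}})\big] \xrightarrow{P} 0
\]

which proves the theorem.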

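The most important example to which this theorem applies is the ratio of the variance components $\frac{\sigma_1^2}{\sigma_e^2}$. The function g(x, y) in this case is $g(x, y) = \frac{x}{y}$, with

\[
\frac{\partial g}{\partial x} = \frac{1}{y}, \qquad
\frac{\partial g}{\partial y} = -\frac{x}{y^2}, \qquad
\frac{\partial^2 g}{\partial x \partial y} = -\frac{1}{y^2}, \qquad
\frac{\partial^2 g}{\partial y^2} = \frac{2x}{y^3}.
\]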
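
To make the ratio example concrete, the sketch below evaluates the asymptotic variance from the theorem, the double sum of g_i g_j v_ij, for g(x, y) = x/y using the first partial derivatives 1/y and −x/y². It is only an illustration: the function name `ratio_delta_variance` and the numerical values of x, y and the entries of V are hypothetical and do not come from the dissertation.

```python
def ratio_delta_variance(x, y, v11, v12, v22):
    """Delta-method variance g1^2 v11 + 2 g1 g2 v12 + g2^2 v22 for g(x, y) = x / y."""
    g1 = 1.0 / y        # partial derivative of x / y with respect to x
    g2 = -x / y ** 2    # partial derivative of x / y with respect to y
    return g1 * g1 * v11 + 2.0 * g1 * g2 * v12 + g2 * g2 * v22


# Illustrative values only: evaluation point (x, y) and a symmetric matrix V.
avar = ratio_delta_variance(x=2.0, y=1.0, v11=4.0, v12=1.0, v22=2.0)
```

In practice x and y would be replaced by estimates of the two limiting quantities and V by a consistent estimate, such as one built from the jackknife variance estimates discussed earlier in the chapter.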




By assumption A1(i) $\sigma_e^2 > 0$, so the second partials of g are all continuous in a neighborhood of $\left(\sigma_1^2 + \tfrac{1}{n_{1*}}\sigma_e^2,\ \sigma_e^2\right)^T$. The jackknife estimator of the ratio, $\gamma = \sigma_1^2/\sigma_e^2$, is given by


a /ocp  /r  OOP  ,/r  ^ 1-  r2SSEl  k-lf  r2,-iSSEl,-i  1 

®jack(SSEi' ri ,SSE2' r2^  ~ k rxSSE2  k A r,  -SSE,  • + ni*’ 

1 * 1 = 1 1 > -1  2—1 


CHAPTER  THREE 

GENERAL  LINEAR  MODEL  WITH  SEVERAL  VARIANCE  COMPONENTS 
3.1 Introduction


In  the  previous  chapter,  the  problem  of  estimating  functions  of 
two  variance  components  under  a mixed  general  linear  model  was 
considered.  In  this  chapter,  this  problem  will  be  extended  to 
estimating  any  fixed  number  of  variance  components  under  a mixed 
general  linear  model.  All  of  the  previous  theorems  will  be  extended 
to  the  general  case.  The  outline  of  this  chapter  will  be  similar  to 
the  outline  of  chapter  two.  Initially,  the  linear  model  will  be 
developed.  Next,  the  Henderson  method  III  estimator  will  be 
presented.  A central  limit  theorem  for  this  estimator  is  proven.  The 
jackknife estimate is developed and shown to be asymptotically equivalent to the unjackknifed estimator. Finally, a jackknife estimate of variance of the estimator is derived and is shown to converge in probability to the variance of the asymptotic distribution.

It is assumed that there are q+1 variance components where q is known. The model for this chapter will be written in the form

\[
Y = X_0\beta_0 + X_1\beta_1 + \cdots + X_q\beta_q + e \tag{3.1.1}
\]

where

(i) $\beta_0$ is a $p_0 \times 1$ vector of fixed but unknown components,

(ii) $\beta_1, \ldots, \beta_q$ and $e$ are mutually independent, mean zero, random components with finite $4+\delta$ moments for some $\delta > 0$,

(iii) $\mathrm{Var}(\beta_j) = \sigma_j^2 I_{p_j}$ for j = 1,..., q, and $\mathrm{Var}(e) = \sigma_e^2 I_{n_\cdot}$.

For later reference let $\sigma_e^2 = \sigma_{q+1}^2$, $e = \beta_{q+1}$, and $X_{q+1} = I_{n_\cdot}$. The following assumptions give a nested structure to the $X_j$ for j = 1,..., q.

kj 

Al:  i)  For  j = 1,...,  q,  X-  = © X 

J t= 1 

ii)  For  j = 1,...,  q,  col(Xj)  C col(Xj+1) 

iii)  i <“a<  k = °('Po^  f°r  j = 1’-"’  q> 

Assumptions A1 i) together with A1 ii) imply that the model is what is referred to as nested. Assumption A1 iii) is a technical assumption which assures that the terms $SSE_j$, defined below, can be approximated by a sequence of independent random variables for purposes of the central limit theorem. It should be pointed out that for most standard models, such as nested error regression, the maximum eigenvalue is a function of cell size only, and thus is automatically bounded.
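
The remark about the maximum eigenvalue can be checked numerically in the simplest case. The sketch below assumes a one-way layout in which the block of the random-effect design matrix for cell i is a single column of n_i ones, so that the block times its transpose is the n_i by n_i matrix of ones; this layout, and the helper names `max_eigenvalue` and `cell_gram`, are assumptions made for illustration and are not taken from the dissertation. Power iteration is used only to keep the example free of external libraries.

```python
def max_eigenvalue(matrix, iterations=200):
    """Largest eigenvalue of a symmetric nonnegative definite matrix (power iteration)."""
    n = len(matrix)
    v = [1.0 + 0.1 * i for i in range(n)]  # arbitrary start vector with positive entries
    for _ in range(iterations):
        w = [sum(matrix[r][c] * v[c] for c in range(n)) for r in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    w = [sum(matrix[r][c] * v[c] for c in range(n)) for r in range(n)]
    return sum(v[r] * w[r] for r in range(n))  # Rayleigh quotient at the final vector


def cell_gram(n_i):
    """X X^T for a cell whose design block is a column of n_i ones (assumed layout)."""
    return [[1.0] * n_i for _ in range(n_i)]


# Largest eigenvalue for cells of size 2, 3 and 5; each equals the cell size.
eigs = [max_eigenvalue(cell_gram(n_i)) for n_i in (2, 3, 5)]
```

Under this assumed layout the largest eigenvalue equals the cell size n_i, so it stays bounded whenever the cell sizes do.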

The partitioning of $X_1$ induces a partitioning of $X_0$ as well as each of the matrices $X_2, \ldots, X_q$. Corresponding to each block of $X_1$ is a vector $Y_i$ so that

\[
Y_i = X_{0,i}\beta_0 + X_{1,i}\beta_{1,i} + \cdots + X_{q,i}\beta_{q,i} + e_i, \qquad i = 1, \ldots, k. \tag{3.1.2}
\]




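For each j = 1,..., q+1 there is a design matrix $U_j = [X_0 | \cdots | X_{j-1}]$ which has corresponding sums of squares $SSE_j = Y^T(I - P_{U_j})Y$. If we denote $C_{j\cdot} = X_0^T(I - P_{X_j})X_0$, for j = 1,..., q+1, then according to Henderson and Searle (1981)

\[
P_{U_{j+1}} = P_{X_j} + (I_{n_\cdot} - P_{X_j})X_0 C_{j\cdot}^{-} X_0^T(I_{n_\cdot} - P_{X_j}). \tag{3.1.3}
\]

The following representation holds for j = 1,..., q

\[
\begin{aligned}
SSE_{j+1} ={}& \sum_{i=1}^{k}(Y_i - EY_i)^T(I_{n_i} - P_{X_{ji}})(Y_i - EY_i) \\
&- \sum_{i=1}^{k}\sum_{\ell=1}^{k}(Y_i - EY_i)^T(I_{n_i} - P_{X_{ji}})X_{0i}C_{j\cdot}^{-}X_{0\ell}^T(I_{n_\ell} - P_{X_{j\ell}})(Y_\ell - EY_\ell)
\end{aligned}
\tag{3.1.4a}
\]

\[
SSE_1 = \sum_{i=1}^{k}(Y_i - EY_i)^T(Y_i - EY_i) - \sum_{i=1}^{k}\sum_{\ell=1}^{k}(Y_i - EY_i)^T X_{0i}(X_0^T X_0)^{-}X_{0\ell}^T(Y_\ell - EY_\ell). \tag{3.1.4b}
\]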

3.2  Multivariate  Central  Limit  Theorem 


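In order to estimate $\sigma^2$, the vector of variance components, we use a vector estimate $\hat{\sigma}^2 = (\hat{\sigma}_1^2, \ldots, \hat{\sigma}_{q+1}^2)^T$, and it is the goal of this section to show that this vector satisfies a multivariate central limit theorem. For j = 1,..., q+1 and i = 1,..., k denote

\[
\begin{aligned}
g_j(Y_i) ={}& E(SSE_j \mid Y_i) \\
={}& (Y_i - EY_i)^T(I_{n_i} - P_{X_{j-1,i}})(Y_i - EY_i) \\
&- E(Y_i - EY_i)^T(I_{n_i} - P_{X_{j-1,i}})(Y_i - EY_i) \\
&- (Y_i - EY_i)^T(I_{n_i} - P_{X_{j-1,i}})X_{0i}C_{j-1,\cdot}^{-}X_{0i}^T(I_{n_i} - P_{X_{j-1,i}})(Y_i - EY_i) \\
&+ E(Y_i - EY_i)^T(I_{n_i} - P_{X_{j-1,i}})X_{0i}C_{j-1,\cdot}^{-}X_{0i}^T(I_{n_i} - P_{X_{j-1,i}})(Y_i - EY_i) \\
&+ E(SSE_j).
\end{aligned}
\tag{3.2.1}
\]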

Lemma 3.2.1 Under model (3.1.1) with assumption A1, for j = 1,..., q+1

\[
E\Big|\big(SSE_j - E(SSE_j)\big) - \sum_{i=1}^{k}\big(g_j(Y_i) - Eg_j(Y_i)\big)\Big|^2 = O(p_0^3).
\]


Proof. First observe that the set of vectors $\{Y_i\}$ for i = 1,..., k is an independent set. Secondly, since $C_{j\cdot}$ is symmetric for j = 1,..., q+1 then $C_{j\cdot}^{-}$ can also be taken to be symmetric. Finally, note that $C_{j\cdot}C_{j\cdot}^{-}$ is idempotent.

~ J • ~ J • 


(sSEj-EfSSEj))  - J:(gJ(Yi)-Eg.(Yi); 


=E 


\ EE  (Yj-EY^Tfln.  -Px  .)X  iCT  xj/l  -P  ) 

U<i#«<k  v i -j-l,^0’!  J !’•  0’*^  t -j-l,r 

(Yg-EY£)J 

EE  E(Yi-EYi)T(ln.-Px_  k .cr  xj/i  -p  ) 

l<i^£<k  V i ~j-l,i'  0,1  J 1»*  £ -j-l,£^ 

(Y«-ey£) 


-2 




2 EE  tr 
1 < i ^ £ < k 


V-(Vi )(l«i  - PX j.! , jX.,  , .£,£■( -%_!,«) 


vara«)(i„rPx..l!j>0>«cJ-1,.s:>i(i„i-px..li.) 


2 EE  tr 

1 < i ^ £ < k 
.q+1 


r /q+1 


(jC  <’m5mi*mi)(Jni  Pjj  j jX.i-j-li.-o./-11*  PSj-l,£) 


(jC^AXlne-Px..! 


k k 

2E  Etr 

i=l  £=  1 


r /i+i 


(j|1?i<T™'mi~™i)(~ni  PXj_1?iK,i'j-l, -~0,£(5n£  PXj_1>£) 


q+1 

Cl 


5 CV"'  <"“  kChL(^miSli)) 


k k 
E Etr 
i = l £=1 


k k 


- pxJ_1 , iXiC  j-i , ■ 


= 0(Po)E  Etr 

i = l £=1 


*oi(lni-PxJ_1,iXi6J-l,.sUlnrPSj.1,{>o,«5J-l, 


- OfPoJtrfSj.!,.  Cj-l,.  Cj-1,.  Sj-1,.] 
= KPSltrpj-!,.  5J.1,.] 

= 0(Po)rank[Cj_1 5>j 


= 0(Po )P0’ 




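The objective of obtaining a central limit theorem for the vector of estimates $\hat{\sigma}^2 = (\hat{\sigma}_1^2, \ldots, \hat{\sigma}_{q+1}^2)^T$ can now be seen as equivalent to a central limit theorem for the vector of projections

\[
\Big(\sum_{i=1}^{k} g_1(Y_i),\ \ldots,\ \sum_{i=1}^{k} g_{q+1}(Y_i)\Big)^T.
\]

The Henderson III estimates, $\hat{\sigma}^2 = (\hat{\sigma}_1^2, \ldots, \hat{\sigma}_{q+1}^2)^T$, are linear combinations of $SSE = (SSE_1, \ldots, SSE_{q+1})^T$ so we can write

\[
\hat{\sigma}^2 = H_k\, SSE
\]

where $H_k$ is an invertible matrix, i.e.,

\[
H_k^{-1} = \begin{pmatrix}
h_{11} & h_{12} & \cdots & h_{1,q+1} \\
0 & h_{22} & \cdots & h_{2,q+1} \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & h_{q+1,q+1}
\end{pmatrix}
\]

and $h_{jj} = \mathrm{tr}\big[X_j^T(I_{n_\cdot} - P_{U_j})X_j\big] > 0$, for j = 1,..., q+1.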


The  following  assumptions  are  sufficient  to  have  a central 
limit  theorem  for  the  vector  of  estimates. 

A2:  i)  >0. 

ii)  For  j = 1,...,  q+1, 


E rj  = E hjj  = E tr 


".j  > 0 


87 


and  for  £ > j 
h. 


k 


j£ 


hjjh££ 


has  a finite  limit. 


iii)  ^Var(SSE)  — *■  V where  Y is  positive  definite. 

iv)  p^k  < rank^Py  ^ < p-^k  + p0  and  lim^-  < 1. 


v)  -f=  — > 0 as  k — ► oo. 
Nk 


vi 


> 1 <"<%+!  — ' « ra"k(Sj)  “ i rank<Sj-l>}  S ‘ >»• 


Note  that  as  a consequence  of  A2(ii) 


k Hj^1  - h;1 


where,  as  the  notation  implies,  is  invertible,  i.e.,  has 

positive  entries  on  the  diagonal.  Furthermore,  since  i rq+l  = 


Itr 

ktr 


rank^  ) 


in.  - puq+1J  — n*q+i  and  by  A2(v)  lim  k 


exists  then 


has  a limit  which  will  be  denoted  by  n*.  This  is  long  run  average 
cell  size.  Occasionally  it  will  be  convenient  to  work  with  the 
vector  RS  = (^j-SSEj  ,...,— SSEq+^)^.  Note  that  as  a consequence  of 
A ( i i ) , Rk<72  = E(R_S)  also  has  a limit,  and  this  limit  will  be  denoted  by 


5*  • 


Lemma 3.2.2. For $\ell$ = 1,..., q+1 and m = 1,..., q+1

\[
P_{U_\ell}P_{U_m} = P_{U_{\min\{\ell, m\}}}.
\]

Proof. For $\ell = m$, this is a property of projection matrices, so without loss of generality $\ell < m$.




Pjj  Pit  — Py  Py 
^£  Vm  *£- 1 -m-1 


+ P5f-l(!  “ “ P*mJ 

+ (l  - - px1.1>o6»-i,  - O 

= pxt_  L+ 

(since  PY  PY  = PY  for  £ < m) 

~£-l  -m-1  -£-l 


- piv 


Lemma  3.2.3  For  any  d £ Rcl+^\{0}  with  i ^ component  d^ 


where  6-  = V d • and  P.T  = I 

1 jti j yq+2 


Proof . 




Lemma 3.2.4 Under model (3.1.1) with assumptions A1 and A2, for any $d \in R^{q+1}\setminus\{0\}$, if $Q_k = d^T SSE$, then $\frac{1}{k}\mathrm{Var}(Q_k) > \epsilon$ for some $\epsilon > 0$ and k large.

Proof.


Var(Qk) 


Var 


q+1 

,?,d‘ 


(Y  — EY)T(l  — Py  _)(Y  — EY) 


= Var 


q+1  / q 

+ 


i - Pi 




+ 4Var 


J^a)  (il1di(I-pui))  s + Var  eT(^1di(*-p«i)) 5 


> Var 


r fq+i  I2! 

> min{(^4)e-4);  2«t|}  tr|  £^1(1 -Py.) 


by  Lemma  2.2.1 


= mi 


n{(^4,e~°e)>  2(Te}  .5Z)(5i{rank(PUi+1 ) “ rank(PUi)}  by  Lemma 


3.2.3 


where  6-  = £ d-  and  U0  = 0. 
j=l  J 

Now  assuming  A*4  e > (t^>0,  one  gets 


^Var(Qjc)  > constant  £ rank(Pjj  ^ rank(P^_  )| 

= constant  £ <52{j^  rank(X^+j)  — ^ rank(X^)|. 


Then  ^Var(Qj.)  > positive  constant  •£  > 0.  Q.E.D. 

Lemma  3.2.5.  Under  model  (3.1.1)  with  assumptions  A1  and  A2,  for  any 
d G ^ \ {Q}  where  d^  denotes  the  itk  component  of  d,  and  e>0 

satisfies  ^Var(Q^)  > t in  lemma  3.2.4  then 


kvyT(k?iSj(Yi))j = ^Var  j?i  i?idjgj(Yi) 

Proof . According  to  (3.2.1), 


q+i  k 


- 2 " 


.ESj+lOfi)  = SSEj+1 


+ constant 


i < < ua«  -ej«)T(j  - pxjt>o,<cr.?J,»(!  - %>«  - «.) 




Note  that  for  any  j ^ j7 

Cov[ssej+1;  EE  (Y,-ey£)t(i-px  Jx .sj,d(l -Px )(Y.-EY.) 

J l<£#m<k  v ~J  « ’ J ~J  m 


= — Cov 


t < E E < k(Y, - ey4)T(j - Px.{>MCjr.?So,m(l - Px  J(Y„ - EY„) 


' Jm 


t < E E < k(Y{-EY,)T(i  - Px  .,£>o,«5j '.lUl-h  Jn.-**.) 


■J  m 


2 EE  E(Y£-EY£)T(l-Px  )x  MC-xJ  (i_p  )(Ym-EYm) 


(Y,-ey{)t(i  - px  )x  fcp  xj,„(l  - px  )(Y 

\ ~ji  J ~j  m 


m - EYm) 


= -2,  JE  e(y£-ey()t(i-p  )x0>£crxj>d(i-px  J(Y„,-EY„> 

1 < *•  +-  m < k J1  1111 


' jm 


(Ym-EYii)T(l  -PX >on,Cy.sJ,f(l  -P.xyt)(Vt-EYe) 

= -2,  j£<Ai 

CSV”4"'"-'")1  - -px 


Thus 


CovTsse j+1 , EE  (Yf -EY{)T(l - PX K«- j7. -»»(J “ px )« 

^ r ni  S ^ J ^ J ni 

= 2!  < < HCSV-^41  - p*>5^  - %J 


m-EYm) 




■ 2i 

-px  .,>»5y.sU! -^xyt)} 

5 Ctl"”  l<?<kChL^iSjiOlr"'  istsk^n'i^'i] 

li  iHJ  - - P5 j J1  - - V 

£ CjM'ri'2  1 “ < kchi-(S«i5»i)) 

tr{  ExJ/l  -px  . v j(>(K,«Sr.151«i(l  ■ -px  j v y>mMf) 

= 2 ( £ o'?  max  ehffx  ?X^?H  trie  . ./  C r C . .,  C T,  } 

Vn=jAj'+l  1 < 1 < k L\'ni'ni7  I'JV  y.-J.-jV  j .'j  J 
= 2|  V (t?  max  ch.fx  • X J • ) ) minfrank  C-  ; rank  C ./  1. 

UiAi-ti " is»sk  LUl'ni7  \ -j.  -j./ 


So  for  k sufficiently  large 


i Cov  sSEj+1 , , s E E s k(V|  - EVf)T(l  - Px  .sUl  - pXjJ« 


Ym-EYm) 


4(q+l  )“||  d ||“ 


Thus 


r Cov 


-Ts~E,SdJ  1 < < k(Yf  - ey‘)T(j  - - p*  j J(Y 


Ym-EYm) 


£ 

4 * 


< 


Now  apply  this  to  ^Var 


= EVar{j|0dj{SSEj+l 

1 = 1 1 ^1  J1 


|0djSSEj+i 


> 


|0djSSEj+i 


•1  k k / 

Ed|  E E (Yi-EY.)(i 

j=0  J i=l  i'jfci  1 v 

> e - § > o. 


rxT./l 
J • ~oi  V 


Theorem  3.2.1  Under  model  (3.1.1)  with  assumptions  A1  and  A2 




>fk  V"2 


k \q+1  ( k \q+1 


K0'  -q+l)' 


Proof. By the Cramer Wald device, it suffices to show for any nonzero vector $d \in R^{q+1}\setminus\{0\}$


>fk  (dTVd)"2  dT 


vq+i 


,q+l 


a5igj(Vi)jj=1  “ Wj(Vi7j=1 


N(0,  1), 


For  convenience,  let  q^  = d^^gj(Y^)J  and  = d1(SSEj).  It  follows 
from  Theorem  2 of  Whittle  (1960)  that  for  j = 1,...,  q+1,  i = 1»...,  k 


- 


E|(Yi-EYi)T(ln.-Py.i)(Yi-EVi)  - E(Yi-EYi)T(!n.-PUj.)(Yi-EYi) 


2+6 


( 

< Lj(2  + ODjlran^H.-Py..)! 


(3.2.2) 


where  Lj  and  Dj  are  independent  of  k,  Lj  is  a function  of  6 only  and 

nhit21-’  ■■■■•  l#qif+2<!  lies lP+2i} 

< max  { D • ) . 

~ 1 < j < q+1  J 


Now 


|-H--PVjiXi  * EYKl-Pg..^ 


2+<§) 


= 0(k) 


implies  1 

i=l 


q+1 


q+1 


2+6, 


Edjgj(Y •)  - E Edigi(Yi) 

j=l  J J j=l  J J 


= 0(k) , 




Lemma  3.2.5  and  this  last  statement  are  what  is  necessary  in  the 
Liapounov  condition  to  show 

<k  (dTyd)"2dT(qk  - Eqk)  4 N(0,  1). 

This  proves  the  univariate  central  limit  theorem  from  which  the 
multivariate  central  limit  theorem  follows.  Q.E.D. 

3.3  Jackknife  Estimates 


It  is  the  purpose  of  this  section  to  show  that  the  jackknife 

estimates  obey  a central  limit  theorem  and  that  the  jackknife 

estimate  of  variance  converges  in  probability. 

As  in  Chapter  2,  denote  Y • = col  (Y*)  and  j = 1,...,  q+1 

l<£<k  £ 

where  P.  = 5 j ,-i)_  * j, -i' ' F°r  = 1--  " 

J 5 1 


U, 


= PX;  s 


~ j+l>-i  ~ j * — i 


+ In  _n  -P 


CTi.  Ah 


-p^ 


±n.  -ni  ~ rXj  _i  y^°  s-i-j-!;-  i -°  ni  "rXj?_i 


and  PIL  • = ^o.-iK-i'O.-i)  “0,-i-  Then  for  J = 1»-»  °1+1' 

~ 1 5 ” 1 


SSE 


j>-i  = Yli(in..n.-Py._.)v_i. 




Next 


denote  r{.  = tr(x?( I - Py . )X j)  if  j > i for  1 < i < 

<*-V) 


rq+l  = tr| 


r • = r • • 
1 li 


The 


V 

jackknife  estimate  of  a"j  for  j = 1,...,  q+1  is 


-2  -2  k-1 

j , jack  j k \ = \ J>-1 


E * 


r • 

Denote  for  j = 1,...,  q+2,  n^  = lim  -rr  and  T • = SSE*. 

J k — >oo  K J nj  J 

Lemma  3.3.1  Under  model  (3.1.1)  with  assumptions  A1 , A2,  and  A3 

Va^Tj-jMSEj)  = <# 


Proof , 


Varfl  • --J-SSE  ■)  = Vaiff-4 ^ SSE- 

V J 1 j J / ^ \n jk  0 y J 

= & - '?  j:  Var(ssJEj) 

^ J 

= o(l)  0(1)  \ Var(SSEj) 


(3.2.2) 


= °( 1 ) 0(1)  \ 0(1)  £ rankfl  — Pit  . by 

K i=l  V -Jl7 


= il 


The  following  representations  are  useful  tools. 
2,...,  q +1: 


"jU.jack  = "jKTj  - »5  V £ U,-i 


q.E.D. 


First  for  j = 




= .E(Yi-EYi)T(i  -r?  VYi-EYi) 

1=1  J"1*1 

- EfYi - EYi)T(i.. - px. , .K;Cj7 iU In; - px. , .>Yi - EYj) 


, sH5s  k(Yi  - EYi)T(!-i  - %M'  - %-m>y 


k k 


E E E .<Y,  - EY,)T(ln<  - Px.  ] {)y(  - EY<) 


+ E E .(Y, - - P^Jwi  (in,  - Px. , ,)Y, 


+ E i £ , < ,££  < k^-^,)T(.n,-Px.  (in 

(#0 


nJTj 


m 


E , <££<  t(Y>  - EYi)T(»"i  - %i,M-  ^e-pxH,) 


E ,£  S .(Y,  - EY,)T(ln,  - r^jsjffo  - CjT K/in,  - P) 


+ i E EE  (Y£ - EY£)T(i  - Px.  )x0 /c.-j - c.- )xT 
K i=l  1 < i / m < k V * °’ty  J>  1 J-  ' 

(#i) 


)• 


For  the  special  case  of  j = 1 , 


"lb,  jack  = "T^l  - "J  ^ £ T 

1 = 1 


= n*Tl 


e-EY£) 
t~E  Ye) 

PX.t  )Ym-EYm) 
(3.3.2  a) 

Ve-EYe) 

h-JY‘-EY‘> 


(3.3.2  b) 




+ i EE  (Y£-EY£)TXq  ,(xTx0)-1xTm(Ym-EYm) 
1 < £ m < k 


+ i E E (Yj-EY^X  iX„  .)  - (xJXoJ-^Yj-EY,) 

1 = 1 tjt  1 1-  -I 


k 


+ £ E EE  CYt  - EY,)Txoe  (xT  .*  j) ' 

1=1  1 < t / m < k ’ 

(#i) 


- (xTx0)-1]xTn(Ym_EYm). 


(3.3.3) 


Lemma  3.3.2.  For  j = 2,...,  q + 1 using  the  notation  Cj^  = 

^o,l(ln|“pXj_1)So,«  then  Cj.  = EGj|.  Denote  rank  (C j ) = Sj  < p0  = 
rank  (X0)  . It  is  possible  to  choose  a sequence  of  C • integers,  £•  < 

(Cj  \ 

1 < ii  < ...  < if  < k so  that  coll  C col(C-  ). 

J j Vm=l'J  ~3’ 

Proof.  Choose  the  minimal  i^  so  that  rank(C--  ) > 1.  If  rank(C--  ) 

= Sj  the  process  is  done;  else  choose  the  minimal  i2  > ij  so  that 

rank(Cj^  + Qj^)  > rank(C j i ^ ) + 1 > 2.  By  induction,  there  is  a 

sequence  1 < ij  < i2  ...  < i^  < k,  where  1 < i ■ < s . < pQ , so 

j J J 

that 

q-e-d- 

Notational ly , let  I - = { i , . . . , i«  }.  Then  for  any  i £ I • 

Lemma  3.3.3  Under  model  (3.1.1)  with  assumptions  A1  and  A2,  for  j = 
2,...,  q+1 


'IE(Tj-Tj,jack)-0- 




Proof. By using representation (3.3.2), it suffices to show the last three terms in that representation converge to 0 in probability at the proper rate, i.e., are


i)  First  consider  EE  (Y£-EY£)fln  - Px  . )x  £C  - X. 

l<£^m<k  1 ~ j-l ; t7 

i < < k(Y'  “ EY')(ln« _ ~ph-i  , >■  _ EYm) 


i 


2 EE  tr 

1 < £ ^ m < k 


(V p*  j.i  pxj_1 


/q+1  /. 
< 2 Y max  chrl. 

“ Vntj  1 < i < k LV 


^ni -ni  Vn 


E E 

1 < l <k 


tr 


= 0(Po)  E E trfci  gcrc.  mcrl 

1 <£^m<k  L J’  J‘  J’ 


£ 0(P“)  Sl  Si 


= 0(pjf)tr-[Cj  Cj  Cj  Oj  ] 

= °<Po)f[Cj.Cjr.] 

= 0(Po)  rank(C-  ) 

J ' 




< °(Po)  • Po  = 0(Po)- 
ii)  Next  consider  the  term 


£( , < hi  < - ey')T(!"«  - pxj.1,(K,fei  - ci:K»k  - pxi.lim>vm  - EYm). 

(#i) 


E{(l/,<F^<k>Y'-EY«)T(!"rpx,MK,( 

(#i) 


\21 

fe-T) 

^m(lnm-Px.lm)(Ym-EYm)J| 

= I M . < hi  < 

(#i) 

(lnm  - PX.  . )(Ym  - EYm)(  £ E )y  ; - EYg,)T(ln  . - Px  Jxq  £ 
V m -j-l.m'  1 < l # m < V 1 £ V i'  -j-1  ,r  0,t 

(#i) 

(5j>  - T KUinm,  - px.  , ,mK'-EYm')} 

k 

= 2E 

k 1=1 

EE  EE  (Y£-EYg)T(l  -Px-  X/ci-i-ci-)xJm(ln  -Pv  ) 
i'=l  1 < l # m < k V « °A  J>  1 J-  ' ,mV  nm  ~ j-1, nr 

(#M') 


(Ym  EYm 


k k 


XYm  - EYm)T(!„m  - Px,lm)Xo,m(?j>-T  K,fint  ~ %lf>Yt  - EYf) 


= 2E  E(  EE  V Var(Y£)(ln  -px  K/Cri-CrWm 

i=l  i'=lVl  < i t m < k'  “«  j.  /-o,m 

(In™  " P^j-1,JVar(Ym)(Inm  “ - cj~  ~ px._M) 




< {' "e  , w*  A(xnixJi>n)  £ £ ( £ £ ) 

\n=j  1 < i < k ^ nl  niy  / i=l  i'=ivl  <£^m<ky 

(#M0 


/q+i 


0(p“)ili  ^ " c^: + wK<w>  - + 1 jj-'i. ) 


0(P3)  E f t, 

i=l  i'=l 


fC;  jC;-.  - C;  C;-YC.  ./C.-.,  - C;  C;_) 

\~Jri~J.-i  -J-'J-  A~j,-i  ~j,-x  'J-'J-  / 


+ C;:C.-(c  ,cr.,  - c.  C.-)  + (c.  :Cj~;  - c.  c.-)c.  .,C.~+  C-  -C.-C.  „C.- 

J1  J-  v J>-i  j;-i  J-  J-  / v Jr>  J)-i  /~j;i  -J-  ~ J»i ~ J-  -j,-i  -J. 


— rwv.  2\ 


(J 

r k 
£ 

(c.  -cr.  - C.  Cf ) ] 

^ fc.  .,C - C-  c.-) 

i=l 

V'Jri'Jri  / ./ 

"jV'J.-i  ~J,-i  'J-'J-  / 

V Nil 

i‘j7i  - Cj.cj:) 

4-  tr 

'jtfij 

(3.3.4) 


Note that it is possible to choose a generalized inverse so that
$C_{j,-i}C_{j,-i}^{-}$ is symmetric for each $i$, as well as $C_{j\cdot}C_{j\cdot}^{-}$ is symmetric.
This will force $C_{j,-i}^{-}$ to satisfy the third Moore Penrose condition.
In this case $C_{j,-i}C_{j,-i}^{-}$ is idempotent and symmetric, i.e., it is a
projection matrix.

Recalling the definition of $I_j$ above, one sees that for any $i \notin I_j$,
$C_{j,-i}$ has the same column space as $C_{j\cdot}$. Thus both $C_{j\cdot}C_{j\cdot}^{-}$ and
$C_{j,-i}C_{j,-i}^{-}$ are projections onto the same column space. Since
projections are unique, $C_{j\cdot}C_{j\cdot}^{-} = C_{j,-i}C_{j,-i}^{-}$. This shows that the
right hand side of (3.3.4) is equal to


0(Po))tr 


V (c.  -C“  • - C-  cr)  T (c.  . ,C 7 - C- 

i ei  i/gj  ~j;-i 


+ 2tr 


Sj.«i.  - Cj.cr) 


+ rank(C 


Observe that the column space of $C_{j,-i}C_{j,-i}^{-}$ is contained in the
column space of $C_{j\cdot}C_{j\cdot}^{-}$ and both are projection matrices. Thus,
$C_{j\cdot}C_{j\cdot}^{-} - C_{j,-i}C_{j,-i}^{-}$ is also a projection matrix. Now applying Lemma
2.3.1 gives


tr 


£ fc ;cr  . - c.  C r)  £ (c.  .,C7  .,  - C-  Cr) 
iel.v  j’  J’  J ' i'ei.  Jl‘i  Ji'i  ~J-~J •/ 

J J 

< s]  j»i„{ra„k(Cj  CJ.  - Cj._iSj-._f)}  < 


Po 


and  tr 


2 

< Po- 


The  result  of  this  is  that  (3.3.4)  is  0(pg)  and  under  assumption  A1 : 


Po  n , 

-f=  — * 0 as  k — > oo. 
\k 


iii)First  for  i 0 Ij  consider  the  terms 


. ( V4 - EY{)T(inr PS . ^ ()x0  Jc r . . - C J i 


By the construction of $I_j$, the condition from Henderson and Searle
(1976) is satisfied, so that $C_{j,-i}^{-} - C_{j\cdot}^{-}$ is nonnegative definite.


J f,  % , - eY<)T(! n{  - pSM>0,fe  - cj7  K«(fn(  - pxj.u)(Te  - EY£)J 


= E 
i0I 


J 

/ 9+1 


E .*rV«(Y  ^ - Px.  , >olffei  - Cj7  - P^J 

£ C?j  i<T<kch^"i^.iH)  ifj.  t$i  tr[5ji5r;-i-ej.) 

- C?j  ifx.  tr[cj.-i(cr.-i-cj.)] 

= 0(po)  . E J;-i  - Cj.«i.  * Sj.iCJ.] 

= 0(Po)  if,  tr[-j;iei] 

< 0(Po)  tr[Cj.Cr] 

- 0(P?). 


Thus  4= 


A i fi.  $ i(Y{ ■ ey',t(!“* ■ pxH>o,«fei - cr ~ EY<> 

converges to 0 in $L_1$ and hence in probability. Next treat the terms
for $i \in I_j$.


EE  E (Y£- 

ieH^i 

ci:KK~\JY 

= 

E Etr 
i el  t ^ i 

-cfK<(sy -pXj.u)] 

/ q+i 

V 

W!nrpxj.uKfei 

< 

E E 

i € I \n=j 

max  chrfx 
1 < i < k LV-m-n,!^ 



- i¥i 0(Po) 


< O(p0Krank  Cj.  jj  + rankpk 

< 2&fp0)  • sj  rankpj  J = O(pjj). 


Lemma  3.3.4  Under  model  (3.1.1)  with  assumptions  A1  and  A2 


'Kb -Tl,  jack)  ' 0. 


Proof. By representation (3.3.3) it suffices to show that the last three
terms in (3.3.3) converge to 0 in probability when multiplied by $\sqrt{k}$.

i>  E{^  [ <EE<k(V{-EYe)\i{(xJx0)-1xLm(Y„-EYjj2| 

= ^ EE  Ef(Y4-EYt)TX0>,(xJx0)-1xYB(Vm-EYm)[ 

= I ,<F<?<k  t4o,f(SoTX„)-1xl.(|iXimXTma?  + Vl.Al.-'i) 

X„,m(xJXo)-150T,/.|iXin,xTina?  + Sq+i,m!!j+1:J.«l)} 


S Ri?i  l<T<k  chL^'ij'ij)<Ti  + °e 


EE  tr 
1 < £ < m < k 


< i 0(P?)  £ £ tr[x  ^xjxo)-1^ mX0  (^Xo)-1^  1 

t— 1 m=l  L ’ J 


~ o , £ ( ~o  - 0 ) ~ 1 ~ o , m - 0 , m ( ~o  ~ o ) _ ^ , f] 


= i 0(po)  tr[ho] 


_ 1 
fk 

E{l 


„3> 


E EE  (Yj  — EYj)‘x  0 e {*^-1*0  .l) ' 

i=l  1 < i < m < k £ 1 0,LL  1 0)1 

(#i) 


-a 

- (xTxor1]xTm(Ym_EYm) 


k k 


4E  EE  EE  (Yf  - EYg)Tx0 1 (xj_  jX0 _.) " - (xj?,,)-1 

Nki=li'=l  ll<£<m<k  o,^  o,i  o,i  j 

(#0 

xJm(Ym  - EYm)  £ E (Y£/  - EY£,)TX0  JfxJ  X ) " - (X^)-1 

l<£'<m'<k  ’ L 0,1  0,1 


( #i') 


X1  ,(Y  ,-EY  ,) 

~0,m  v~m  ~ m y 


= |E  Ee|  EE  (Y£ - ey£)tx0 I(x*_ jX0 .j) - - (x^r1] 

i'=l  ll<«<m<k  £ °’T  0,1  0,1  -1 

(#M0 

X^m(Ym-EYm)(Ym-EYm)TX0)m[(xTiX0ri)-  - (xjx0)_1j  X^£(Y£  - EY£)| 


=£iS£i<F<5<k^  - 

(#wo 


k k 


/T  2 


Tv  x-1 


rTV  Y-l 


-0,1 1 


k k 


< k 




< J 0(p?)  £ .£ 


k k 


rTv  \-l 


Tv  i-l 


= i OO&.E  £ fjK A,i)(aSo)-1^,i-50,i-)(5j!!o) 

* i °(po)  tr[ip„] 

= £ 0(P?)> 

iii)  Finally  consider 

i .£  E.a£-Eltl)T(g£>.i-Se,. )(!«-*!(«) 

= i £ T E.(Y£-ey£)t[x0>£(xT  _iX0>_i)-xT  £ - x0jt(x3'x0)-1x3'>(](v,-EYt) 

1 f-  I®  T-  1 

E.(Y£-EY()1[x0>j(x3'i_iX0>_i)-X0Ti{  - X0 ^(xJXoJ-ixJ^Yj-EYj). 
For  each  i*I,  ^(5*  _ . X0> ) * X* f - X0>((xJx0)-1xJ>{  is 

nonnegative  definite  according  to  Henderson  and  Searle  (1981)  because 
the  condition  col(xJ  -X  •)  C colfxj  -X  is  satisfied.  Thus, 

E E.(Yj-EY{)TX  J(xT  X )-l  - (xjx0)-1lxl' >£(Y£-EYt) 
i £ I £ ^ i J 

is  a nonnegative  function. 

E .E  E.(Y(-EYt)Yx0^xT_.X0i_i)-l  - ( sjjto ) ~ 1 £(Y£- EYf) 


; E E 


+ Ve^o.lK.-iVi)-1  - (STSo)'1]*?,*} 


£ - ,xTx0)-l]xT 


1 1 «# 


= 0(p„)  eJixJ.jX  ifxTjX  i)-1  - (xjxp)-1 
1011  L 


< O(p0)  £ tr|xT.Xo  i(xTxor1|  = O(p0)tr(lpo) 


= 0(pg). 


Now 


EE< 

i € I 


E (Y£-ey£)tx oi(xJ_iX  j)“  - (xJXorijxT^-EY,) 

t i n- 


t2'| 


= E E< 
i ei 


' E (Ye-ETj)TxJ(xTiX  ,)"  - (xfXor^Yj-EYj) 


= ,E  E.tf(Yt-EY|)TX0i(((xTiX0i.i)-  - (XJXot-^^Yj-EYf) 


+ E EE  E(Yf-EY,)TX J(xT|X  ,)-  - (xJXor^lYj-EYj) 
i € I 1 < £ ^ £'  < k 
(#0 

(Y{,-EY{,)Tx0j(((xTix0.i)-  - (xJxor'^tY^-ET,,) 


,E  E v«|(Y<-EYj)Tx0j<((xT..x0i..)-  - (x0Tx0)-i)xT  (Yt-EYf)j 


+ E E 

iei  i± 


.{*- )"  - (XjXor'K^EXjjX^I  + 1„^)  | 

- (xfxor'^l^xT,!  + I„( 


+ E EE  lr  x0  /(xTjX  ) 
i G I 1 < l ± t'  < k L ’ v 
(*i) 


IO 




tr 


*o,<K-i*o,-L>  ~ (XoXo)  + !n/ 


= E E Va,  (Y4-EY{)Tx Jfx^x  i)-  - (xTx„r‘)xJ£(Y,-EY4) 

i £ I l± i l ’ ) 

+ e e e >{*o, <(«£-&,, -i> ' - <^?or1)*?/i:  + V 

i £ I e=l  t'= 1 \j — J- 

(#0(#i) 

tr  -J-i ” _ + V7*, 

< E E Vai(Ye-EY|)Tx o^fx^x  |)-  - (xTxoY'^Yj-eyJ 


+ Ei|t{«J.ix0,.i)(xJ.ix0>.i)-  -(xjx0rl; 

= E E Vai(Y,-EY£)TX  J(xT  x M)-  - (xJjSoI-^t^-ey,)} 
i G I £ ^ i l J 

+ 0(P5)  E {r.nk^J  - Po  + ' JjX0  |( X„'  xo j1  )J 

= E E v«J(Y£-EYe)Tx  <((xT.x  ;)-  - (x„Tx0)-')x0T  (Y  - ey{)1  + 0(P„5) 

i G I £ ^ i l ’ ’ ’ J 

= if,  i-ll' - (*I*°r‘Xex je^} 

+ 2i?i  «?  i x Cj  5j^ - (sJSoT1)*^^] 


+ 0(Po) 


i|(p4J-3a-f)t,fxTxoi(XMXo,-i)'  - (xJxor>)xT  Xj  { 


q+1 

E E E 4 (P4  j ~ 3<x?)tr 
j=i  iei  i±  \ { d J 




di4xifXoi(^.iVir  - 


+ 2<r^tr 


zji\lal$o,.0-  ~ (x?Xo 


*X<(«MVir  “ 


+ 2 £ E EE 

i€l  1 <j  < q+1  J 


^Xoi(xJ-iSo,-i)'  - (XoXor1)?^.,, 


+ O(Po) 


X^^IXT  ;XM)  - - (xJXor^T 

iW  - (XoT?or‘)x0T«Xj,< 


q+1  4 

E 0^4  ; - 3of ) E E 

j=l  J J i€l  £^i 


xLx.  li 

x? 

•X 

-J  «- 

u>tv 

~ 0 

q+1 

E 

j=i 

"4j 

-3’j 

E 

E 

tr 

X0 

iei 

U1 

- <^„r‘)xj£xj( 

+ 2J)  E^.E  E.t-^xJfxJ.^.r  - (xfxor^Xj,, 

+ O(Po) 

max  chT  (X--X-1 ) ) 

1 < i < k Lv'J*'Ji;y 

Xoi(xJ.iX„,.i> ' - (XJx0)-1)xT^ia^X0i<((xT_ix„i.i)  - - «Tx„)-l)xT  ] 

+ 2 *£  *£  crfor?/  max  chT  (X..xT)Y  max  chr  (X./.X^.)) 

j=l  fii  J J\1  < i < k LV-.P-Ji'^l  <i<k  V-j'i'jV) 

ill  £!il{*°i«MXo,.i>~  - (*?«o)'1)x^o,((«J-iX0,.ir  - (X^o)-‘)xJe] 


< Ofp^l.E  - (xJxor‘)xT 


+ 0(pi)Ei  E.trjxJlxTjX^)-  - (xJx0r')xT  ^ 

(«£-i?o,-i>~  - (X?x0r1)xj<|  +o(Po5) 

= 0(p2)  E E.4xJ(xTiXM)-  - (xTx0r')xTf 

1 t 1 t y—  1 v 

^UBfArf  - («?So)-1>?l]} 

+ O(Po)- 


It is necessary to make some assumptions to control the last term.
Similar to A2 (ii) of Chapter 2 is assumption A3:


E 

i ei 


E 


dia§M(X  0,-i*0,-i) 

which  will  hold  if 


= 0(p?) 


l<ix<k  1<!<ktr[-°^('0’-i-0’-i)  ” (^So)"1)#,*]  = 0(Po). 

Then  under  A3  the  last  term  is  0(pg)  and  the  lemma  is  proven. 

It should be pointed out that if $\mu_{4,j} - 3\sigma_j^4 \le 0$, for $j = 1, \ldots, q+1$,
then no extra assumption is necessary.




Theorem  3.3.2  Under  model 


1 

>fk  V"2 


(3.1.1)  with  assumptions  A1 , A2,  and 
- <Tj,jack)j!)  } - Vi<8'  W" 


A3 


Proof . Combining  Theorem  3.3.1  with  Lemmas  3.3.3  and  3.3.4  gives  the 
result  by  Slutsky’s  Theorem. 


Theorem 3.3.3 Assume that g is a real valued function of q+1
variables, $g: \mathbb{R}^{q+1} \to \mathbb{R}$, with continuous second partials which are
bounded away from zero in a neighborhood of $R^{*}\sigma^{2}$. Under model
(3.1.1) with assumptions A1, A2, and A3,


where  g^  = 


4<(Tj.j*ck>p  - s(8„e2)}  4 Nl(o,  5 5 

0 and  is  the  (i,j)  element  of  Y- 


dg 


dxi 

Proof. This is an immediate result of Theorem 3.2.1 and the delta
method. A Taylor expansion of $g\big((T_{j,\mathrm{jack}})_{j=1}^{q+1}\big)$ in a neighborhood of
$R^{*}\sigma^{2}$ gives


+ 


q+i 

E 

j=i 


dg_ 

9X] 


(R*?2) 


q+l 

E 

M 


q+l  q+l 

+ E E 

J=i  j'=i 


s2g 


dx-dx., 

J J 


t V?XTrt- 


q+l 

E 




2 • • 

where  £ is  on  a line  segment  between  0 and  R a~ , and  rjg  is  the  j£ 
entry  of  R* . Thus 


- S(B.!2) 


th 


j=l  “j 


q+i 

^jjack  ~ ^ T\f<TP 


(R*?2) 


<=j 


q+l  q+1  g2 

+ ^ E E 8 


(Tj Jack  rj,^<T^Ij/jack  g^,  rj/,€<T^)* 


j=i  A axj5xj' 

The  claim  will  follow  from  Theorem  3.3.2  if  it  can  be  shown  that  the 
last  term  converges  to  zero  in  probability. 


Now  if  it  can  be  shown  that  >Jk 


^g 


< q+l 

Jj, jack  “ ^ F j,< 


o \ 

i (T  f)  I 3.S 


i — dz& 

well  as 

dx ./ 


q+l 


Jj'jack  - C°nVerge  ‘°  ° in 


probability,  then  the  cross  product  term 


<k 


d2g 


dx  -dx 
J J 


q+l 


q+l 


j , jack  ^ rj£<rfj^Tj/,  jack  r j'lat 


must  converge  to  0 in  probability  by  the  Cauchy  Schwarz  inequality. 


According  to  Theorem  3.3.2, 


^jjack  - |!rr?)  i N(°, 




Thus  k^Tj  jack  - £ rjt<Tf)  ^ squared.  Finally  apply  Slutsky’s 

J 


Theorem  to 


'"{b.jack  - i ' k(TJ,jack  - S/jr?) 


q+1 


T,  , 


Sc 


. . , — V r -fCTp)  converges  in  probability 

j , jack  je  tj 


q+1  q2 

so  that  -Jk  Y'  — r 

j=l 

to  0 since  g has  continuous  second  partials  in  a neighborhood  of  £. 
This  proves  the  theorem. 

Now for each estimate $T_{j,\mathrm{jack}}$, $j = 1, \ldots, q+1$, the jackknife
estimate of the variance is given by


Theorem  3.3.4  Under  model  (3.1.1)  with  assumptions  A1 , A2,  and  A3, 
the  jackknife  estimate  of  variance  converges  in  probability  to  the 
asymptotic  variance  of  the  estimator. 

Proof . First  observe  that 


<k-1)STw-^,TH 


= (t-l)|i(Tj,.i-Tj  + Tj-I(ET..fj 

= <k  - - tj)2 + - TiXTj  - 1 + (h  - e k 2] 




The  last  term  is  shown  to  go  to  zero  in  probability. 


(k  — 1) 


- k-1 
" k 


- k-1 
" k 


1 

k 


k 

£ 

£=1 


- k-1 
" k 


(k-DTj 


k-1 

k 


k - l{T j T j » jack} 


which  converges  to  0 in  probability  by  Lemma  3.3.3  for  j = 2,...,  q+1 
or  by  Lemma  3.3.4  for  j = 1 . 

Now  if  the  first  term  converges  to  a finite  quantity  in 
probability,  the  cross  product  term  converges  to  0 in  probability  by 
Cauchy  Schwarz.  Thus 


(k-1) 


* k / \2 

converges  in  probability  to  n’w--  if  and  only  if  (k-1)  £ (T  • --T- 

J J J i_i \ J > 1 J/ 

also  does.  Next  note  that  according  to  Lemma  3.2.1  it  suffices  to 

approximate  n^T  • • by  yy  1 £ g -(Y.)  and  similarly  n^T  • by 

JJ’1  1 J £ ^ i J c J J 




j*.  Egi(Yi)-  Therefore, 

K i = l J 

cp2(i-i)|i(Tj,.i-Tj)2 

l2 

= (k  - Tjri  -jljl  ,(gj(Ye)  - Egj(Yf))  - n?Tj  + \ E (gjfYi)  - ^(Yj))} 

+ 2(k  - 1)E  {njTj, ^ E .(gj(Yf)  - Egj(Y{))  - n|Tj  + J E ^(Yj)  - Egj(Yj)) 


{f  l/sjfYi)  - Egj(Yi))  - E ,(gj(Yf)  - Egj(Y<))| 

+ <k  - ^{e  l/kjlYi)  - Egj(Y|))  - E ,(gj(Y4)  - Egj(Yg))} 

= jij  O(p0)  + (k  - (gj(Y,)  - EgjtYi))  + (i  - jir)  E .(g j(Ye)  - ESj(Yf))} 

= “<»  + |,(gj(Yi)  - Egj<Yi>)2  + 4>  E (gj(Yi)  - EgjIYi))  E .(gj(Y£)  - EgjCY4>) 


k - 1 


k2(k-l)2  i=i 


E/gjfY^-EgjfYj)) 


i2 


If $\frac{1}{k}\sum_{i=1}^{k}\big(g_j(Y_i) - Eg_j(Y_i)\big)^2$ converges to $v_{jj}$ in probability, then the
remaining terms must be o(1) by Cauchy Schwarz.

Eii|1(*j«i>-E*j«  of 


- E I,  v“K*j«i>) 




= i Var(i|1gjai)) 

= i v“(SSEj - ) - SSEj) 

= i Var(SSEj)  + jj  Cov/sSEji  E gjQ'i)  - SSEj 

+ “ SSEj) 


= ± Var(SSEj)  + o(l). 


By A2, $\frac{1}{k}\mathrm{Var}(SSE_j) \to v_{jj}$, so by Theorem B(i), p. 275, of Loeve,
$\frac{1}{k}\sum_{i=1}^{k}\big(g_j(Y_i) - Eg_j(Y_i)\big)^2$ converges in probability to $v_{jj}$. This proves
the theorem.

We are now in a position to prove that the jackknife estimator,
$g_{\mathrm{jack}}(T_1, \ldots, T_{q+1})$, is asymptotically equivalent to the function of the
jackknife estimators, $g(T_{1,\mathrm{jack}}, \ldots, T_{q+1,\mathrm{jack}})$, but the proof depends
on the following technical lemma.

Lemma 3.3.6 Under model (3.1.1) with assumptions A1, A2, and A3

$$\max_{1 \le i \le k} |SSE_j - SSE_{j,-i}| = O_p(\sqrt{k}) \quad \text{for } j = 1, \ldots, q+1. \tag{3.3.6}$$


Proof . The  case  of  j=l  will  be  handled  first.  According  to  (3.3.2  b) 


SSEj  — SSE  _i  = (Yi-EYi)T(In.-Px  .KY.-EJf.) 

’ x x 1 ~oi 




+ EE  (Y.-EYi)Tg.|(Y£-EY£) 

1 < J #« <k  J J J 

+ EE  (Yj-EYj^-C^  i](Y£-EY£) 

l<j^£<k  J J J 

(*i) 


+ .E.(Yj-Evj)'[cjj,.i 


'.iPVr'V 


By  part  i)  of  Lemma  3.3.4  we  see  that 


EE  (Yi-EY.)TCi£(Y£-EY£)=0(p2) 


By  part  ii)  of  Lemma  3.3.4  we  see  that 


EE  (Yj-EYj)T[CJ{  — Cj,  _iKY,-EY()=0(pJ) 
l<j^£<k  J J J J ’ 

(#i) 

By  part  iii)  of  lemma  3.3.4  we  see  that 


.E.(Yj-EYj)T[Cjj,.i-CjJ](YJ-EYj)  = 0(pJ). 

Thus 

SSEj  - SSE  ..^(Y.-Ey.^In.-Px  . )(Y. -EY.)  + 0(p*)  . 
Now  by  Chebychev’s  inequality,  for  any  e > 0 

K 1<T<k«i-E»i)T(I»,-pxoi)(b-EVi)>^) 

<.E^E(Yi-EYi)T'(In.-Pi(i)i)(Yi-EYi) 




= 0(1), 

by  A1 . 

ii)  According  to  (3.3.2  a)  for  j = 2,...,q 


SSEj+1-SSE2j_i  = (Yi-EYi)T(In.-Pj..)(Yi-EVi) 


- EE  (Ym-EYm)'(inm-Ps.  )x0lnCj,xJ£(i „ -Pj  XYj-EYe) 


l<m^<k 


■ jm 


l e' 


+ E ( YrEYt) 1 ( In4 PSj{)  wq, -i  ?j . ln( - pX J4) (Y« - EY<) 


+ EE^  JYm-EY„)T(Inm-Ps.JXon,(Cj_rCj  )xT't(IlirP5.t)(Ye-EY{) 


1 < m ^ < k 

(#i) 


•jm 


By  part  i)  of  Lemma  3.3.3  we  see  that 


EE  (Ym-EY.)I(I„111-Px.  )Xom<Yj.Xi(I„  -Px.,)(Yf-EV4)  = 0(p5) 


1 < m ^ £ < k 


-jm 


£ tje' 


By  part  ii)  of  Lemma  3.3.3  we  see  that 


ZttcViFU n Px. #)MCj,-iCj.)3S(I n£-P X..)a£-EY£)  = 0(pS) 


4£  - j£ 


£ *j£' 


By  part  iii)  of  lemma  3.3.3  we  see  that 


EE  (YB-EYm)T(i„B-px  )WC],.rc].)xT(i  p )(y<-ey<)  = o(p§) 

1 < m ?£  £ < k ~jm  J J t ~jt 

(#i) 


Thus 




SSE 


j+l-SSEj+l,-i  = (Yj -EYi)T(I„i-Pxj.)(Yi -EYi)  +0(pS). 


Now  by  Chebychev’s  inequality,  for  any  e > 0 


K ! ■?|k(Vi-EYi)T(In.-Px..)(Yi-EYi)>t^) 

S.E^Yi-EYi^d  -PX  XYj-EYj) 
1 = 1£  K 1 -J1 

= 0(1), 

by  Al.  This  proves  the  lemma. 


Theorem 3.3.5 Assume that g is a real valued function of q+1
variables, $g: \mathbb{R}^{q+1} \to \mathbb{R}$, with second partial derivatives which
are continuous in some neighborhood of $R^{*}\sigma^{2}$. Again, under model
(3.1.1) with assumptions A1, A2 and A3


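$$\sqrt{k}\,\big\{g_{\mathrm{jack}}(T_1, \ldots, T_{q+1}) - g(R^{*}\sigma^{2})\big\} \xrightarrow{d} N\Big(0, \sum_{i=1}^{q+1}\sum_{j=1}^{q+1} g_i g_j v_{ij}\Big)$$

where $g_i = \left.\dfrac{\partial g}{\partial x_i}\right|_{R^{*}\sigma^{2}}$ for $i = 1, \ldots, q+1$ and $v_{ij}$ is the $(i,j)$ element of $V$.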
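
The limiting variance in this theorem is the usual delta-method quadratic form in the gradient of g and the matrix V. As a minimal sketch of how that double sum could be evaluated numerically, assuming a user-supplied function `g`, an evaluation point `x`, and a covariance matrix `V` given as nested lists (all three names are illustrative and not from the dissertation), one might write:

```python
def gradient(g, x, h=1e-6):
    """Central-difference approximation to the gradient of g at the point x."""
    grad = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += h
        down[i] -= h
        grad.append((g(up) - g(down)) / (2.0 * h))
    return grad


def delta_method_variance(g, x, V):
    """Evaluate sum_i sum_j g_i g_j v_ij, where g_i is the i-th partial of g at x."""
    grad = gradient(g, x)
    m = len(x)
    return sum(grad[i] * grad[j] * V[i][j] for i in range(m) for j in range(m))


# Illustration with the ratio g(x, y) = x / y at (2, 1) and V equal to the identity:
# the gradient is (1, -2), so the quadratic form equals 1 + 4 = 5.
ratio = lambda x: x[0] / x[1]
variance = delta_method_variance(ratio, [2.0, 1.0], [[1.0, 0.0], [0.0, 1.0]])
```

The numerical gradient is used only to keep the sketch self-contained; when the partial derivatives of g are available in closed form they should be used instead.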


Proof. This will follow from Theorem 3.3.3 if it can be shown that

$$\sqrt{k}\,\big(g_{\mathrm{jack}}(T_1, \ldots, T_{q+1}) - g(T_{1,\mathrm{jack}}, \ldots, T_{q+1,\mathrm{jack}})\big) \xrightarrow{P} 0.$$

To do this a Taylor expansion of both functions will be used. In a
fashion similar to the proof of Theorem 3.3.3 expand
$g(T_{1,\mathrm{jack}}, \ldots, T_{q+1,\mathrm{jack}})$ in a Taylor expansion in a neighborhood of
$(T_1, \ldots, T_{q+1})^T$ which gives

, jack,-"’^q+l , jack)  ~ g(^l’",’^q+l) 


+ I1^(T..  — Tq+iKTi.jack-Ti) 


(3.3.7) 

where $\epsilon$ is on a line segment between $(T_{1,\mathrm{jack}}, \ldots, T_{q+1,\mathrm{jack}})$ and
$(T_1, \ldots, T_{q+1})$.
Next observe that expanding $g(T_{1,-i}, \ldots, T_{q+1,-i})$ in a neighborhood of
$(T_1, \ldots, T_{q+1})^T$, for $i = 1, \ldots, k$ gives

gjack(Ti’-",Tq+l) 


kS(T.--Tq+l)-¥i£s(T.,-i'-’Tq+l,-i) 


V,-i)-*(T- Vi» 


k q+i 


S(T> v.)(h,-i-h 


~Ti)] 


..  , k q+lq+1  a2 

+ ^i?i  j?!  “Tj)(T«,-i  - 1) 




:*<T> Tq+1)  + ?)[^(Tl’-’Tq+1)(Tj,>ck-Tj)] 


. 1 k q+lq+1  d2 

+ Z Ea^<iKTJ,-i-Tj)(T£,-i-T() 


(3.3.8) 


A comparison  of  (3.3.7)  and  (3.3.8)  yields 


®jack^i  ’•••’Tq+l)  ®(^i,jack’  '“’^q+1 , jack-^ 


, k 1 k q+lq+1  d2 

,k-1-£  £ SlSnfea)(Tj,.i-Tj)(T4i.i-Tt) 


2 k 


i=l  j^fcl^j1 


+ f|  ^(sHq.jack-TiK^Jack-T,) 


(3.3.9) 


Since the second partials of g are continuous and the maximum
difference of $T_j - T_{j,-i}$ converges to 0, for $j = 1, \ldots, q+1$, then the second
partials of g evaluated at A are simultaneously bounded. Thus
according  to  Theorem  3.3.4 


*¥&  ££jc^4)(Tj.-i-Tj><Vi-T*>  4 °-  <3-3-10> 


k q+lq+1  *2 


Again,  applying  the  continuity  of  the  second  partials  of  g, 
together  with  the  results  of  Lemmas  3.3.3  and  2.4.5  shows  that 


^{^?1  Si  jack-Ti)(T£, jack-T£)}  ^ °-  (3.3.11) 


Finally,  combining  (3.3.10)  and  (3.3.11)  implies  that 




^Sjack(Tl’ 


•’Tq+l)  S^Tl,  jack’"',Tq+l,  jack^ 


which  proves  the  theorem. 


CHAPTER  FOUR 

HIERARCHICAL  BAYES  ESTIMATION  OF  THE  VARIANCE  RATIO 


4.1 Introduction

In this chapter, we will consider the problem of estimating the
ratio of the variance components, a monotonic function of what is
often referred to as the "heritability ratio", in a balanced one-way
Anova model with covariates. As mentioned in Chapter 2, the usual
Anova estimators have the possibility of producing negative estimates
of variance. The positive part Anova, the maximum likelihood (ML),
and the restricted maximum likelihood (REML) estimators, the latter
two being derived under the normal model, are non-smooth and hence
are inadmissible under any smooth loss, for example squared error.
Furthermore, the performance of such estimators under non-normal
models is open to question. Thus, the question that naturally arises
is whether there exists a smooth non-negative estimator of the
variance ratio which performs satisfactorily for small, moderate, and
large sample sizes and for a wide variety of distributions including,
but not limited to, the normal.

We  propose  in  the  present  work  a Bayes  estimator  of  the  variance 
ratio  derived  under  a hierarchical  normal  linear  model  for  balanced 
data.  The  Bayes  estimator  of  the  variance  ratio  is  derived  under  a 




hierarchical  model  given  in  Datta  and  Ghosh  (1989).  This  model  uses 
a prior  different  from  the  one  in  Portnoy  (1971).  In  particular,  the 
prior  proposed  in  Datta  and  Ghosh  (1989)  does  not  depend  on  the 
sample  size,  as  opposed  to  the  one  given  in  Portnoy  (1971).  The 
proposed  Bayes  estimators  are  quite  satisfactory  asymptotically.  In 
Section  2 consistency  as  well  as  asymptotic  normality  of  these 
estimators  are  derived  under  certain  mild  moment  conditions,  without 
requiring  any  distributional  assumption  of  the  observations. 

In  order  to  construct  an  asymptotic  confidence  interval  centered 
at  the  Bayes  estimator,  we  derive  in  Section  3,  a jackknife  estimator 
of  the  variance  of  the  asymptotic  distribution  of  the  Bayes  estimator 
after  suitable  normalization.  This  jackknifed  estimator  is  shown  to 
converge  in  probability  to  the  true  asymptotic  variance. 

The  possibility  of  jackknifing  Portnoy’s  Bayes  estimate  was 
mentioned  in  Arvesen  (1969),  although  he  did  not  carry  out  the  actual 
computations.  Also  the  present  work  considers  an  Anova  model  with 
covariates  which  is  not  considered  at  all  in  Arvesen  (1969)  or 
Arvesen  and  Layard  (1975). 

The  present  chapter  will  proceed  as  follows.  The  Hierarchical 
Bayes  estimator  of  the  variance  ratio  will  be  derived  under  a model 
given  in  Datta  and  Ghosh  (1989).  Under  mild  conditions,  this 
estimator  will  be  shown  to  be  consistent.  Next,  a central  limit 
theorem  for  this  estimator  is  derived.  For  proving  consistency  or  the 
central  limit  theorem  the  normality  assumption  will  not  be  required. 
The  jackknifed  version  of  this  estimator  will  be  shown  to  be 




asymptotically  equivalent  to  the  Hierarchical  Bayes  estimator.  For 
the  purpose  of  writing  confidence  sets  an  estimate  of  variance  of  the 
asymptotic  distribution  is  necessary.  The  jackknife  estimate  of  the 
variance  of  the  asymptotic  distribution  is  derived  and  is  shown  to 
converge in probability to the parameter it estimates.

4.2  The  Derivation  of  the  Bayes  Estimator 


Consider  the  following  hierarchical  model  as  presented  in  Datta 
and  Ghosh  (1989). 

I. Conditional on $\Theta = \theta$, $B = b$, $R = r$ and $\Lambda = \lambda$, $Y_1, \ldots, Y_k$ and $S$ are
mutually independent with $Y = (Y_1, \ldots, Y_k)^T \sim N_k(\theta, (nr)^{-1} I_k)$ and
$S \sim r^{-1}\chi^2_{(n-1)k}$.

II. Conditional on $B = b$, $R = r$, and $\Lambda = \lambda$, $\Theta \sim N_k(Xb, (\lambda r)^{-1} I_k)$,
where $X$ is $k \times p_0$, it is assumed that $k > p_0$, and rank$(X) = p_0$.

III. $B$, $R$ and $Z = \Lambda R$ are marginally mutually independent with
$B \sim$ uniform$(\mathbb{R}^{p_0})$, $Z$ has pdf $f(z) \propto z^{-2}$ and $R$ has pdf $f(r) \propto r^{\frac{1}{2}g_0 - 1}$,
where $g_0$ satisfies $(n-1)k + g_0 > 0$. Thus $B$, $R$ and $Z$ all have
improper prior pdf's.

Stages I and II of the above hierarchical model can be
identified as a balanced mixed effects model. To see this let

$$Y_{ij} = x_i^T b + v_i + e_{ij} \quad \text{for } i = 1, \ldots, k \text{ and } j = 1, \ldots, n. \tag{4.2.1}$$

In the above $x_1, \ldots, x_k$ are known design vectors, $b$ is the vector
of regression coefficients, $v_i$'s and $e_{ij}$'s are mutually independent
with $v_i$'s i.i.d. $N(0, \sigma_v^2)$ and $e_{ij}$'s i.i.d. $N(0, \sigma_e^2)$, where $\sigma_v^2 = (\lambda r)^{-1}$
and $\sigma_e^2 = r^{-1}$. The minimal sufficient statistic for this
problem is $(Y_1, \ldots, Y_k, S)$ where $Y_i = \bar{Y}_{i\cdot} = n^{-1}\sum_{j=1}^{n} Y_{ij}$ and
$S = \sum_{i=1}^{k}\sum_{j=1}^{n} (Y_{ij} - \bar{Y}_{i\cdot})^2$.
Then $(Y_1, \ldots, Y_k, S)$ has a distribution as specified in I and II. Note
that the balanced random effects one-way Anova model is a special
case of this model where $p_0 = 1$ and $x_i = 1$ for all $i$.

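We are interested in finding the posterior distribution of the
variance ratio $\sigma_v^2/\sigma_e^2 = \Lambda^{-1}$, and more particularly, the posterior
mean of $\Lambda^{-1}$. From Datta and Ghosh (1989) we obtain the posterior
distribution of $\Lambda$ given $Y = y$ and $S = s$ as

$$f(\lambda \mid y, s) \propto \Big(\frac{\lambda}{n+\lambda}\Big)^{\frac{1}{2}(k - p_0)} \lambda^{-2} \Big[s + \frac{n\lambda}{n+\lambda}\, y^T (I - P_X) y\Big]^{-\frac{1}{2}\phi}, \tag{4.2.2}$$

where $P_X = X(X^T X)^{-1} X^T$, and $\phi = nk - p_0 - 2 + g_0$.

Let $U = \Lambda/(n + \Lambda)$. Then, from (4.2.2) it follows that the posterior
pdf of $U$ given $Y = y$ and $S = s$ is

$$f(u \mid y, s) \propto u^{\frac{1}{2}(k - p_0) - 2} (1 + uZ)^{-\frac{1}{2}\phi}, \quad u \in (0, 1), \tag{4.2.3}$$

where $Z = nY^T(I - P_X)Y/S$ is a multiple of the usual F statistic. If
$k > p_0 + 4$ the Bayes estimate of $\Lambda^{-1}$ is obtained as

$$e_B = E(\Lambda^{-1} \mid y, s) = \frac{1}{n} E(U^{-1} - 1 \mid y, s), \tag{4.2.4}$$

where

$$E(U^{-1} \mid y, s) = \frac{\int_0^1 u^{\frac{1}{2}(k - p_0 - 2) - 2} (1 + uZ)^{-\frac{1}{2}\phi}\, du}{\int_0^1 u^{\frac{1}{2}(k - p_0 - 2) - 1} (1 + uZ)^{-\frac{1}{2}\phi}\, du}. \tag{4.2.5}$$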


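Next we study the frequentist properties of the above Bayes
estimator when the treatment and error variances, $\sigma_v^2$ and $\sigma_e^2$, are fixed
respectively at $\sigma_{v0}^2$ and $\sigma_{e0}^2$. To this end, writing $t_1 = \frac{1}{2}(k - p_0 - 2)$ and
$t_2 = \frac{1}{2}((n-1)k + g_0)$, so that $\frac{1}{2}\phi = t_1 + t_2$, one can express $E(U^{-1} \mid y, s)$, as
given in (4.2.5), by

$$\frac{\int_0^1 u^{t_1 - 2} (1 + uZ)^{-(t_1 + t_2)}\, du}{\int_0^1 u^{t_1 - 1} (1 + uZ)^{-(t_1 + t_2)}\, du}
= Z\, \frac{\int_0^1 \big(\frac{uZ}{1+uZ}\big)^{t_1 - 2} \big(\frac{1}{1+uZ}\big)^{t_2} \frac{Z}{(1+uZ)^2}\, du}{\int_0^1 \big(\frac{uZ}{1+uZ}\big)^{t_1 - 1} \big(\frac{1}{1+uZ}\big)^{t_2 - 1} \frac{Z}{(1+uZ)^2}\, du}$$

$$= Z\, \frac{\int_0^{Z/(1+Z)} w^{t_1 - 2} (1 - w)^{t_2}\, dw}{\int_0^{Z/(1+Z)} w^{t_1 - 1} (1 - w)^{t_2 - 1}\, dw}$$

$$= Z\, \frac{\Big[\frac{w^{t_1 - 1}}{t_1 - 1}(1 - w)^{t_2}\Big]_{w=0}^{Z/(1+Z)} + \frac{t_2}{t_1 - 1}\int_0^{Z/(1+Z)} w^{t_1 - 1} (1 - w)^{t_2 - 1}\, dw}{\int_0^{Z/(1+Z)} w^{t_1 - 1} (1 - w)^{t_2 - 1}\, dw}$$

$$= \frac{t_2}{t_1 - 1}\, Z + \frac{Z^{t_1}}{(t_1 - 1)(1 + Z)^{t_1 + t_2 - 1} \int_0^{Z/(1+Z)} w^{t_1 - 1} (1 - w)^{t_2 - 1}\, dw}. \tag{4.2.6}$$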
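
As a numerical illustration of (4.2.4) through (4.2.6), the following minimal sketch evaluates $E(U^{-1} \mid y, s)$ both as the ratio of integrals in (4.2.5) and through the integration-by-parts representation (4.2.6), so that the two can be compared. The function names and the midpoint quadrature rule are assumptions made for this illustration only; they do not come from the dissertation, and for large $k$ the powers should be computed on the log scale to avoid overflow.

```python
def midpoint(f, a, b, m=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / m
    return h * sum(f(a + (i + 0.5) * h) for i in range(m))


def e_u_inv_direct(z, t1, t2):
    """E(U^{-1} | y, s) as the ratio of integrals in (4.2.5), with phi/2 = t1 + t2."""
    num = midpoint(lambda u: u ** (t1 - 2) * (1 + u * z) ** (-(t1 + t2)), 0.0, 1.0)
    den = midpoint(lambda u: u ** (t1 - 1) * (1 + u * z) ** (-(t1 + t2)), 0.0, 1.0)
    return num / den


def e_u_inv_closed(z, t1, t2):
    """E(U^{-1} | y, s) from representation (4.2.6): leading term plus remainder."""
    c = z / (1 + z)
    inc_beta = midpoint(lambda w: w ** (t1 - 1) * (1 - w) ** (t2 - 1), 0.0, c)
    remainder = z ** t1 / ((t1 - 1) * (1 + z) ** (t1 + t2 - 1) * inc_beta)
    return t2 / (t1 - 1) * z + remainder


def bayes_estimate(z, k, n, p0, g0):
    """e_B = (E(U^{-1} | y, s) - 1) / n as in (4.2.4); requires k > p0 + 4."""
    t1 = 0.5 * (k - p0 - 2)
    t2 = 0.5 * ((n - 1) * k + g0)
    return (e_u_inv_closed(z, t1, t2) - 1.0) / n
```

For instance, with $Z = 1$, $t_1 = 2$ and $t_2 = 1$ both routes give $E(U^{-1} \mid y, s) = 3$, which can be checked by hand from the two integrals.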


To study the asymptotic behavior of $e_B$, we use the
representation (4.2.6). We also dispense with the normality
assumption, and accordingly the independence of $S$ with $(Y_1, \ldots, Y_k)$.

In the remainder of this section we consider the mixed model given in
(4.2.1) with $b = b_0$, $\sigma_v^2 = \sigma_{v0}^2$, and $\sigma_e^2 = \sigma_{e0}^2$. The normality assumption
is dispensed with, but the assumption of independence of the $v_i$'s and
the $e_{ij}$'s is retained. Such a model will henceforth be referred to
as model M. The first theorem of this section proves the consistency
of $e_B(Y, S)$ as an estimator of $\sigma_v^2/\sigma_e^2$ under certain moment assumptions on
the $Y_{ij}$'s.

Theorem 4.2.1. Assume model M with $\sup_{i \ge 1} \max_{1 \le j \le n} E|Y_{ij} - EY_{ij}|^{2+\delta} < \infty$
for some positive $\delta$. Then $e_B \xrightarrow{P} \sigma_{v0}^2/\sigma_{e0}^2$ as $k \to \infty$.


Proof . 

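Write $SSE_1 = nY^T(I - P_X)Y$. Since $EY = Xb_0$ and $(I - P_X)X = 0$, one gets

$$SSE_1 = n(Y - EY)^T (I - P_X)(Y - EY)$$

$$= n \sum_{\ell=1}^{k} (Y_\ell - EY_\ell)^2 (1 - c_{\ell\ell}) - n \mathop{\sum\sum}_{1 \le \ell \ne \ell' \le k} (Y_\ell - EY_\ell)(Y_{\ell'} - EY_{\ell'})\, c_{\ell\ell'}, \tag{4.2.7}$$

where $c_{\ell\ell'} = x_\ell^T (X^T X)^{-1} x_{\ell'}$. Using the independence of the $Y_\ell$'s one
obtains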


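$$E\Big[\mathop{\sum\sum}_{1 \le \ell \ne \ell' \le k} (Y_\ell - EY_\ell)(Y_{\ell'} - EY_{\ell'})\, c_{\ell\ell'}\Big]^2
= 2 \mathop{\sum\sum}_{1 \le \ell \ne \ell' \le k} c_{\ell\ell'}^2\, E(Y_\ell - EY_\ell)^2\, E(Y_{\ell'} - EY_{\ell'})^2$$

$$\le 2\big(\sigma_{v0}^2 + \tfrac{1}{n}\sigma_{e0}^2\big)^2 \sum_{\ell=1}^{k} \sum_{\ell'=1}^{k} c_{\ell\ell'}\, c_{\ell'\ell}
= 2\big(\sigma_{v0}^2 + \tfrac{1}{n}\sigma_{e0}^2\big)^2\, \mathrm{tr}(I_{p_0})
= 2 p_0 \big(\sigma_{v0}^2 + \tfrac{1}{n}\sigma_{e0}^2\big)^2 = O(1). \tag{4.2.8}$$

Hence

$$\frac{1}{k} \mathop{\sum\sum}_{1 \le \ell \ne \ell' \le k} (Y_\ell - EY_\ell)(Y_{\ell'} - EY_{\ell'})\, c_{\ell\ell'} \xrightarrow{P} 0 \quad \text{as } k \to \infty. \tag{4.2.9}$$

Denote $f_\ell = 1 - c_{\ell\ell} = 1 - x_\ell^T (X^T X)^{-1} x_\ell$ so clearly $0 \le f_\ell \le 1$. Also since

$$\sup_{i \ge 1} \max_{1 \le j \le n} E|Y_{ij} - EY_{ij}|^{2+\delta} < \infty$$

for some $\delta > 0$, then

$$\sup_{i \ge 1} E|Y_i - EY_i|^{2+\delta} < \infty$$

for this same $\delta$, where $Y_i = \bar{Y}_{i\cdot}$. This verifies Markov's condition for
the weak law of large numbers. Hence,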

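$$\frac{1}{k} \sum_{\ell=1}^{k} f_\ell \Big\{(Y_\ell - EY_\ell)^2 - \big(\sigma_{v0}^2 + \tfrac{1}{n}\sigma_{e0}^2\big)\Big\} \xrightarrow{P} 0 \quad \text{as } k \to \infty. \tag{4.2.10}$$

Since $\sum_{\ell=1}^{k} f_\ell = \mathrm{tr}(I - P_X) = k - p_0$ it follows from (4.2.10) that

$$\frac{1}{k} \sum_{\ell=1}^{k} f_\ell (Y_\ell - EY_\ell)^2 \xrightarrow{P} \sigma_{v0}^2 + \tfrac{1}{n}\sigma_{e0}^2 \quad \text{as } k \to \infty. \tag{4.2.11}$$

Thus from (4.2.7), (4.2.9), and (4.2.11),

$$\frac{1}{nk}\, SSE_1 \xrightarrow{P} \sigma_{v0}^2 + \tfrac{1}{n}\sigma_{e0}^2 \quad \text{as } k \to \infty. \tag{4.2.12}$$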


Next,  using  the  fact  that  £ (Y-  •—  Y-  )2  are  independent  wi 

• A 1 J 1 « 


th 


j=i 


1+£ 

mean  (n-l)<Tg0,  and  a Iso  that  supE|  £ (Y.  ,-Y-  )2  | 2 <00,  for  the 

i>l  j=l  1J  x* 

above  6,  one  gets 


1 c P 2 1 

j—[jkS  ^eo  as  k-°°- 


Hence , 


(fc-1)  SSEj  p x 


1 ( n -5^  + 1 ) as  k— >oc . 
n - 1 ) ' 


(n-l)k  S 
_ 1 


(n  - 1) 


'eo 


Also,  since  tj  = ±(k  - p0  - 2)  and  t2  = i((n-l)k  + g0) 


(4.2.13) 




2 

7 ^-pr  Z £ n-sp+1  as  k-+oo.  (4.2.14) 

(tl-1)  o% o 

Now following the arguments of Datta and Ghosh (1989), if $h(k)$
denotes the reciprocal of the second term in the right hand side of
(4.2.6), then

$$P\Big(\frac{1}{k} \log h(k) > 0\Big) \to 1 \quad \text{as } k \to \infty. \tag{4.2.15}$$

Thus $h^{-1}(k)$ converges to zero in probability at an exponential rate.
Now using (4.2.4), (4.2.6), (4.2.13), and (4.2.15),

$$e_B \xrightarrow{P} \frac{\sigma_{v0}^2}{\sigma_{e0}^2} \quad \text{as } k \to \infty.$$

The proof of Theorem 4.2.1 is complete.
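
The consistency result can be illustrated by simulation. The sketch below is a hedged example, not part of the dissertation: it assumes the one-way random effects special case ($p_0 = 1$, $x_i = 1$), normal $v_i$ and $e_{ij}$ purely for convenience (the theorem itself does not require normality), and it uses only the leading term $\frac{t_2}{t_1 - 1} Z$ of (4.2.6), since the second term is exponentially small for large $k$. The function name `simulate_leading_term` and its arguments are illustrative.

```python
import random


def simulate_leading_term(k, n, sigma_v2, sigma_e2, g0=0.0, seed=0):
    """Simulate Y_ij = v_i + e_ij and return ((t2 / (t1 - 1)) * Z - 1) / n."""
    rng = random.Random(seed)
    sd_v, sd_e = sigma_v2 ** 0.5, sigma_e2 ** 0.5
    group_means = []
    s = 0.0  # within-group sum of squares S
    for _ in range(k):
        v = rng.gauss(0.0, sd_v)
        y = [v + rng.gauss(0.0, sd_e) for _ in range(n)]
        ybar = sum(y) / n
        group_means.append(ybar)
        s += sum((yij - ybar) ** 2 for yij in y)
    grand = sum(group_means) / k
    # With X a column of ones, n * Y^T (I - P_X) Y is n times the
    # sum of squared deviations of the group means about their average.
    sse1 = n * sum((m - grand) ** 2 for m in group_means)
    z = sse1 / s
    p0 = 1
    t1 = 0.5 * (k - p0 - 2)
    t2 = 0.5 * ((n - 1) * k + g0)
    return (t2 / (t1 - 1) * z - 1.0) / n


# With many groups the value should be close to sigma_v2 / sigma_e2 = 2.
estimate = simulate_leading_term(k=2000, n=5, sigma_v2=2.0, sigma_e2=1.0, seed=1)
```

Increasing `k` tightens the estimate around the true ratio, in line with the theorem; the number of replicates `n` per group stays fixed.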

Under assumptions A1 and A2 from Chapter 2, the Bayes estimator
of the variance ratio, $e_B$, after standardization, satisfies the
central limit theorem. Recall the following notation from Chapter 2
for the fourth moment of a mean zero random variable.

$$E(e_{ij}^4) = \mu_{4,e} \quad \text{for } i = 1, \ldots, k \text{ and } j = 1, \ldots, n \tag{4.2.16}$$

$$E(v_i^4) = \mu_{4,v} \quad \text{for } i = 1, \ldots, k \tag{4.2.17}$$

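The following moment calculations will be useful.

a) $$\mathrm{Var}\Big[\sum_{j=1}^{n} (Y_{ij} - \bar{Y}_{i\cdot})^2\Big]
= \mathrm{Var}\Big[\sum_{j=1}^{n} (e_{ij} - \bar{e}_{i\cdot})^2\Big]
= \frac{(n-1)^2}{n}\big(\mu_{4,e} - \sigma_{e0}^4\big) + \frac{2(n-1)}{n}\sigma_{e0}^4
= \sigma_{22} \quad \text{(say)}$$

which is a result of Lemma 2.2.1. If c.k.(e) denotes the coefficient
of kurtosis of the random variable e then

$$\sigma_{22} = \frac{(n-1)^2}{n}\,\sigma_{e0}^4\, \mathrm{c.k.}(e) + 2(n-1)\sigma_{e0}^4.$$

Since the coefficient of kurtosis of a normal random variable is 0
then if e happens to be normal, $\sigma_{22} = 2(n-1)\sigma_{e0}^4$.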
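
The variance formula in a) can be checked exactly on a small discrete case. The sketch below is an illustration added here, not taken from the dissertation: it assumes errors that take the values +1 and -1 with probability 1/2 each (so $\sigma_{e0}^2 = 1$, $\mu_{4,e} = 1$ and the coefficient of kurtosis is -2), enumerates all $2^n$ outcomes, and compares the exact variance of the within-group sum of squares with the closed form. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def exact_var_within_ss(n):
    """Exact Var[sum_j (e_j - ebar)^2] for i.i.d. e_j equal to +1 or -1 with probability 1/2."""
    values = []
    for signs in product((-1, 1), repeat=n):
        ebar = Fraction(sum(signs), n)
        values.append(sum((Fraction(e) - ebar) ** 2 for e in signs))
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def sigma22(n, mu4, sigma2):
    """Closed form ((n-1)^2 / n)(mu4 - sigma^4) + (2(n-1)/n) sigma^4."""
    s4 = Fraction(sigma2) ** 2
    return Fraction((n - 1) ** 2, n) * (Fraction(mu4) - s4) + Fraction(2 * (n - 1), n) * s4


def sigma22_kurtosis(n, ck, sigma2):
    """Equivalent form ((n-1)^2 / n) sigma^4 c.k.(e) + 2(n-1) sigma^4."""
    s4 = Fraction(sigma2) ** 2
    return Fraction((n - 1) ** 2, n) * s4 * ck + 2 * (n - 1) * s4
```

Exact rational arithmetic via `fractions.Fraction` avoids any floating point tolerance, so the two expressions can be compared with equality for small `n`.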


b)  Recall  that  f ^ = 1 - c^  = 1 - X£T(XTX)  XX£.  Then 


VarCEciCYi-EYj)2] 

i=l 

= Ec2iVar{(Yi-EYi)2} 
i=l 

= E c2Var{(Vi  +ei>)2} 

i = l 

= Ec2iVar{v2i+2viei>+e2i} 
i = l 

= E ci{Var(v?  ) +4Var(viei  )+Var(e|  )} 
i=l 

= (E  ci)[(/i4,v_<Tto)+^Tvo4o+  ^(/i4,e_3<7eo)+4<Teo] 

= (^Eici)  (say).  (4.2.18) 


Note  however  that 




1)  E c2i  = E [1  -2xeT(XTX)  -%  + (x£r(XTX)  -1x£)2] 
i=l  i=l 

2)  x£r(XTX)-xx£<l 

3)  E x£T(XTX)  _1x£  = tr|(XTX)  _1  E^x£rx£|  = tr[Ip(j]  = p0. 


Thus 


k - 2Po  < E ci  < k~Po 
i = l 

Combining  (4.2.18)  and  (4.2.19),  it  follows  that 


(4 


jtl  c£n2Var{(Y£-EY£)2}  -f  an  as  k-»oo. 


(4 


Again  using  the  notation  c.k.  to  stand  for  the  coefficient  of 
kurtosis  of  a random  variable, 

°Ti  = <Voc  • k • ( v ) + 2<tv0  + Tj-^vo^eo  4"  — S^eo0  • k • ( e ) ^ ‘ 

n n 

If  both  v and  e are  normally  distributed  then 

_ _ 0_4  . 4_2  2 ,2  4 _0/_2  .1*2  \2 

°11  ~ “^vO  'ncrvO<TeO  ' 2°e0  — ZV°V0  ' n°eOl  • 

n 


c)  Cov[fcci(Yi-EYi)a;i:J:(Yii-Yi  )2] 
i=l  i=lj=l  J 


: fecov[ci(Yi-EYi)a;{:  (YirY.  )2] 
i=l  iii  j=1  iJ  i- 


: £ ciCov[ ( V.  + e.  )a;  4 (ej  j - ej  )2] 
i=l  j=l  J 


= .EciCoV[e?>;r  (e.j-e.  )2] 
ii=l  _ j=l 

= V r Si lzll(  _ n4  \ _ q("  ~ 1)  4 \ 

E ci|  2 ^4,e  ^eo)  ^ 2 *eo| 

j — i x n 7 n s 


.2.19) 


.2.20) 




(say) 


(4.2.21) 


Still  denoting  the  coefficient  of  kurtosis  by  c.k.,  we  see  that 
a12  = 2-=^  <7g0  c.k.  (e) , and  if  e is  normally  distributed  then  a12  = 0. 


Theorem  4.2.2.  Under  model  M with  assumptions  A1  from  Chapter  2 


>Ik(  7-^TTZ-(n42+l))  - N(0,<r2)  as  k-oo, 
v (ti"1!  ^eo  ' 

1 2 ( °V0  + W^eo ) nl.  (^vo+n^eo)  , 1 _ 1 (A  o oo'* 

where  a = <t22 g *71*12 6 

*eo  aeo  " a eo 


Proof. The function $g(x, y) = x/y$ has continuous second partials in a
neighborhood of $(\sigma_{v0}^2 + \tfrac{1}{n}\sigma_{e0}^2,\ \sigma_{e0}^2)$ since $\sigma_{e0}^2 > 0$ by assumption A1. If we
simply notice that $SSE_1$ and $S$ correspond to $SSE_1$ and $SSE_2$ of Chapter 2
and that the assumptions of Theorem 2.3.1 are satisfied, then this is
just an immediate application of the aforementioned theorem.


Theorem  4.2.3.  Under  model  M with  assumptions  A1  from  Chapter  2 


\|k  ( eg ^ j -i  N(0,<r2)  as  k— +00, 

where $\sigma^2$ is defined in (4.2.22).


Proof. Since $(n e_B + 1) - \dfrac{t_2}{t_1 - 1} Z$ converges to zero, in probability,
at an exponential rate, by (4.2.14), an appeal to Slutsky's Theorem
and Theorem 4.2.2 yields the desired results.


4.3  Jackknifed  Estimator  of  the  Asymptotic  Variance 


The asymptotic distribution of $e_B$ derived in the previous
section cannot be used for the construction of confidence intervals
for the ratio, $\sigma_v^2/\sigma_e^2$, unless a consistent estimate of the
variance of the asymptotic distribution is found. In this section a
jackknifed estimate of the asymptotic variance is shown to converge in
probability to the true asymptotic variance.

The hierarchical Bayes (HB) estimator of the variance ratio is
$e_B = \frac{1}{n}\{E[U^{-1} \mid Y, S] - 1\}$. However, in terms of the
jackknife it suffices to consider $\hat\gamma = n e_B + 1$ since the jackknife
is a linear operator. Thus we will jackknife $\hat\gamma$. Recall the delete
notation from Chapter 2. The quantities
$t_{1(-i)} = \frac{1}{2}(k-1-p_0-2)$ and $t_{2(-i)} = \frac{1}{2}((n-1)(k-1)+g_0)$
are independent of the particular i. Thus


(4.3.1) 


(tl(-»)_1)(1+Z(-«)) 
% " a 


tl(  - .)+t2(  - .)  - 1 


0 


where 




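The pseudovalues are defined by $\gamma_i = k\hat\gamma - (k-1)\hat\gamma_{(-i)}$ and the
jackknife estimator is defined by

\[
\begin{aligned}
\hat\gamma_{jack} &= \frac{1}{k}\sum_{i=1}^{k}\gamma_i \\
 &= \frac{1}{k}\sum_{i=1}^{k}\Bigl( k\frac{t_2}{t_1-1}Z + kT
    - (k-1)\frac{t_{2(-i)}}{t_{1(-i)}-1}Z_{(-i)} - (k-1)T_{(-i)} \Bigr) \\
 &= k\frac{t_2}{t_1-1}Z - \frac{k-1}{k}\,\frac{t_{2(-i)}}{t_{1(-i)}-1}\sum_{i=1}^{k}Z_{(-i)}
    + kT - \frac{k-1}{k}\sum_{i=1}^{k}T_{(-i)} \\
 &= \Bigl(\frac{t_2}{t_1-1}Z\Bigr)_{jack} + T_{jack}. \tag{4.3.2}
\end{aligned}
\]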
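
The pseudovalue construction can be sketched generically. The code
below is a minimal illustration, not the computation used in this
dissertation: `estimator` stands for any function of the list of cells,
and both names are hypothetical. It also makes the linearity remark
concrete, since jackknifing a·θ̂ + b returns a times the jackknifed θ̂
plus b.

```python
def jackknife(estimator, cells):
    """Delete-one-cell jackknife.

    Returns (jackknife estimate, list of pseudovalues), where the i-th
    pseudovalue is k * full - (k - 1) * (estimate with cell i deleted)."""
    k = len(cells)
    full = estimator(cells)
    leave_one_out = [estimator(cells[:i] + cells[i + 1:]) for i in range(k)]
    pseudovalues = [k * full - (k - 1) * value for value in leave_one_out]
    return sum(pseudovalues) / k, pseudovalues


def mean(values):
    """Simple average, used only to illustrate the construction."""
    return sum(values) / len(values)
```

For the sample mean the pseudovalues reduce to the observations
themselves, which gives a quick way to check an implementation.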


Next it will be shown that the unjackknifed and jackknifed
estimators are asymptotically equivalent, but first a technical lemma
is needed.


Lemma  4.3.1  Under  model  M with  assumptions  A1  and  A2  from  Chapter  2, 


(i)  max  |SSE1-SSE1(_,.)|  =0p(^k); 

( i i ) max  | S - S(  _ t)  | = Op(vfk) . 

Proof . (i)  Using  representations  (4.2.7)  and  (4.2.13)  from  Chapter  2 
we  see  that 

SSE1-SSE1(_,.)  = (Yi-EYi)2ci 


(4.3.3) 

(4.3.4) 


+ EE 

1 < i ^ < k 


(Yi-EYiKY£-EY£)ci£ 




+ £ £ 

1 < i ^£<k 


(Yi-EY.)(Y|-EY£)(ci£-ci|(_0) 


<+£.  <k(Y£-EY£)(Y£-EY£)(f£-f£(  _0),  (4.3.5) 


where  ^ = 1 - x£  (X^_ _ ^)  axg.  An  inspection  of  Lemma  2.4.4 

reveals  that  by  part  i) 

£ £ (Yi-EYi)(Y£-EY£)ci£  = Op(l) 

1 < i # « < k 

by  part  ii) 


i £££<k(Yi-EYi)(Y£-EY£)(ci£-ci£(_.))=Op(l) 


by  part  iii) 


< t £ . < k(Y£  - EY£)  ( Y£  - EYg)  (f  £ - f £(  _ ,))  = Op(>fk) 


Thus 


max  ISSE,— SSE,,  •> 
l<i<k  1 1(_,) 


= max 
1 < i < k 


(Yi-EYi)2ci+op(^k). 


Now  by  Chebychev’s  inequality,  for  any  e>0, 


max 
< i < k 


(Yi-EYi)2ci>e^  ) < .S^E{(Yi-EYi)2ci)2  = 0(1)» 


where  the  last  equality  follows  from  (4.2.19). 


(ii)  Note  that  under  model  M, 


S-S 


,.i)=E(vij-vi.)1 

J = 1 


so  for  any  f > 0 , 


pf  max  | S - S,  • J 
' 1 < i < k (_,) 


— EYi  j)2)2  = 0( 1 ) 


(4.3.6) 


This  proves  the  lemma. 




Theorem 4.3.1. Under model M from above with assumptions A1 and A2
from Chapter 2,

\[
\sqrt{k}\,(\hat\gamma - \hat\gamma_{jack}) \xrightarrow{p} 0 \quad\text{as } k \to \infty. \tag{4.3.7}
\]


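Proof. It has already been shown that $T \xrightarrow{p} 0$ at an exponential
rate by (4.2.15). Also observe that
$T - \frac{1}{k}\sum_{i=1}^{k}T_{(-i)} \xrightarrow{p} 0$ at an exponential rate and
therefore
$\sqrt{k}\,(T - T_{jack}) = -\sqrt{k}\,(k-1)\bigl[T - \frac{1}{k}\sum_{i=1}^{k}T_{(-i)}\bigr] \xrightarrow{p} 0$.
Thus the theorem will follow if it can be shown that
$\sqrt{k}\,\bigl[\frac{t_2}{t_1-1}Z - \bigl(\frac{t_2}{t_1-1}Z\bigr)_{jack}\bigr] \xrightarrow{p} 0$.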


A straightforward  algebraic  calculation  yields, 


2(-,)  t2  _ (n-l)p0+g0+8(n-l)  _ n/  t , 
- i ) (ti-l)(ti(_i)  -1)  k2 


"1(  - 0 


so  that 


( t2  Z) 

1 -k-ll 

< ^(-.j  t2  \ 

Uj-i^ 

'jack  k 1 

^1(  - i)  -1 

Thus it is sufficient to show that $\sqrt{k}\,[Z - (Z)_{jack}] \xrightarrow{p} 0$ as $k \to \infty$.

However,  since  Z = ^Z,  this  is  an  immediate  application  of  Lemma 

2.4.4, Lemma 2.4.5, and the fact that $\frac{1}{k}S \xrightarrow{p} (n-1)\sigma_{e0}^{2} > 0$ as $k \to \infty$.

Therefore  the  theorem  is  proven. 

Recall that the definition, from Chapter 2, of the jackknife
variance estimator is

\[
\widetilde{\mathrm{Var}}(\hat\gamma) = \frac{1}{k-1}\sum_{i=1}^{k}(\gamma_i - \hat\gamma_{jack})^{2}.
\]

The next theorem shows that this is a consistent estimator of the
variance of the asymptotic distribution.
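
A small sketch of how this variance estimator might be turned into an
interval is given below. It reads the definition as the sum of squared
deviations of the pseudovalues divided by k − 1, and it assumes a
normal critical value of 1.96 for a nominal 95% interval; the text does
not state which critical value was used, so `z` is left as a
parameter. The function names are illustrative.

```python
import math


def jackknife_variance(pseudovalues):
    """Sum of squared deviations of the pseudovalues about their mean,
    divided by k - 1; estimates the variance of sqrt(k) * (estimator - target)."""
    k = len(pseudovalues)
    center = sum(pseudovalues) / k
    return sum((p - center) ** 2 for p in pseudovalues) / (k - 1)


def normal_interval(point, variance, k, z=1.96):
    """point +/- z * sqrt(variance / k); z = 1.96 is an assumed critical value."""
    half_width = z * math.sqrt(variance / k)
    return point - half_width, point + half_width
```

Dividing the variance by k inside the square root reflects the √k
normalization under which the asymptotic variance is defined.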


Theorem 4.3.2. Under model M from above with assumptions A1 and A2
from Chapter 2, $\widetilde{\mathrm{Var}}(\hat\gamma) \xrightarrow{p} \sigma^{2}$ as $k \to \infty$, where $\sigma^{2}$ is
defined in (4.2.22).

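Proof. First observe that

\[
\begin{aligned}
\widetilde{\mathrm{Var}}(\hat\gamma)
 &= \frac{1}{k-1}\sum_{i=1}^{k}(\gamma_i - \hat\gamma_{jack})^{2} \\
 &= \frac{1}{k-1}\sum_{i=1}^{k}\Bigl(k\hat\gamma - (k-1)\hat\gamma_{(-i)}
    - \frac{1}{k}\sum_{j=1}^{k}\bigl(k\hat\gamma - (k-1)\hat\gamma_{(-j)}\bigr)\Bigr)^{2} \\
 &= (k-1)\sum_{i=1}^{k}\Bigl(\frac{t_{2(-i)}}{t_{1(-i)}-1}Z_{(-i)} + T_{(-i)}
    - \frac{1}{k}\sum_{j=1}^{k}\Bigl[\frac{t_{2(-j)}}{t_{1(-j)}-1}Z_{(-j)} + T_{(-j)}\Bigr]\Bigr)^{2} \\
 &= (k-1)\sum_{i=1}^{k}\Bigl(\frac{t_{2(-i)}}{t_{1(-i)}-1}Z_{(-i)}
    - \frac{1}{k}\sum_{j=1}^{k}\frac{t_{2(-j)}}{t_{1(-j)}-1}Z_{(-j)}\Bigr)^{2} \\
 &\quad + 2(k-1)\sum_{i=1}^{k}\Bigl[\frac{t_{2(-i)}}{t_{1(-i)}-1}Z_{(-i)}
    - \frac{1}{k}\sum_{j=1}^{k}\frac{t_{2(-j)}}{t_{1(-j)}-1}Z_{(-j)}\Bigr]
    \Bigl[T_{(-i)} - \frac{1}{k}\sum_{j=1}^{k}T_{(-j)}\Bigr] \\
 &\quad + (k-1)\sum_{i=1}^{k}\Bigl(T_{(-i)} - \frac{1}{k}\sum_{j=1}^{k}T_{(-j)}\Bigr)^{2}. \tag{4.3.8}
\end{aligned}
\]

As before,
$(k-1)\sum_{i=1}^{k}\bigl(T_{(-i)} - \frac{1}{k}\sum_{j=1}^{k}T_{(-j)}\bigr)^{2} \xrightarrow{p} 0$
as $k \to \infty$. Next we will consider the first term in (4.3.8). A second
order Taylor expansion of Z around $(\frac{1}{k}SSE_1, \frac{1}{k}S)$ will be
employed.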

(t-i)E(tti('‘).,Z|-,r^rhjTz[-i)) 

i=lvti(-0  1 y ’ kj=iti(-»)  1 1 


k t. 




( k_1  yjkSz<  -•>  - z + z - i|tz(  - *)2 


y 

k vtl(  - «)  1 ’ 


iE((Z(.i|-Z)!  + 2(2(_irZ)(Z-ljCZ(_i))  +(z-ljc]2(_j))2) 


l(-«) 


E(Z(-0-Z)J  + .E2(Z(_0-Z)(Z-l^iZ(_yp+k(z-lEZ(-i))2)  (4-3.9) 


The  last  term  on  the  right  hand  side  of  (4.3.9)  will  be  shown  to 
converge  to  zero  in  probability. 

klz-jEZ,-;,)2 

=(.|z-z( 

/^SSE,  SSE1(_0. 

>1  s 5(  - 0 ' 

_/k  SSEj  SSEi(-»)  | SSEi(-i)  SSEi(-i 


=(£ 

= 

j=r  * 


+- 


SSE 


i(-0 


y 


■(-•I 


+ £ 2^SSE1  SSE1(  _ ,)ySSE1(  _ t)  SSE1(  _ 


k /SSEjj _ ^ SSE1(_-A2 

y v^rj 


+ e(: 

j=iv 


(4.3.10) 




(4.3.11) 


(4.3.12) 


Recall that Theorem 2.4.5 implies that the two terms in braces in
(4.3.12) converge in probability to finite numbers, which implies that
the first and third terms on the right hand side of (4.3.12) converge
in probability to 0. Now an appeal to Cauchy Schwarz shows that the
entire right hand side of (4.3.12) converges in probability to 0.

Next we show that the first term on the right hand side of
(4.3.9) converges in probability to $\sigma^{2}$. Here is where the Taylor
expansion is used on $g(x,y) = x/y$ in a neighborhood of
$(\frac{1}{k}SSE_1, \frac{1}{k}S)$.

fc.E  (Z,  - Z)2  = k n2£  (If^SSE,  ,lS)i(SSE1(  _ „ -SSE,) 

+ |(JssE1,ls^(S(.,rS) 



+ 50(A.^.)^(SSEi(-O-SSEi)2 

+ 0(  a, )^(SSE1(  _ 0 - SSEX ) (S(  _ 0 - S) 

+50Ca.^,)^(s(-o-s)2)2  (4-3-13) 


where $\lambda_i$ is on a line segment between $SSE_{1(-i)}$ and $SSE_1$ and $\mu_i$ is
between $S_{(-i)}$ and $S$. It will be shown that the terms involving the
second partials all converge to zero in probability. The easiest
term to handle is $\partial^{2}g/\partial x^{2}$ since it is identically zero. Now
$\partial g/\partial y = -x/y^{2}$ so that $\partial^{2}g/\partial y^{2} = 2x/y^{3}$.


ki5l^^y^Ai,/ii)  k2^S(-<)  S')2^2  _ k3 i?i(i)2  (S(-«)'S) 


k /A- 

A 


= oP(i)i.E(s(_jrs)4 

K 1 = 1 


according to Lemma 4.3.1. Then by (4.3.6) this last term is $O(k^{-1})$.
Next the mixed partial term is handled by Cauchy Schwarz. Thus we
are left with

k i?1(s(iSSEl,is)i(SSEl(-irSSEi)+^iSSEi’Es)E(s(-rs))2 

= ©^(SSE^-SSE,)2 

i/k3SSE,\  k 

~ 2a^)  ^ (SSE'(  - ■•)  - SSE>  > <si  - o-s> 




+ (^fP1) 2 J .E  ( S( _ S )2  (4.3.14) 

Combining  Theorem  2.4.4  and  Theorem  4.2.1  shows  that  the  left  hand 
side  of  (4.3.14)  converges  to  n<r2  + 1 . Since  again  the  cross  product 
term  can  be  treated  by  Cauchy  Schwarz  this  concludes  the  theorem. 


CHAPTER FIVE
RESULTS OF SIMULATIONS

5.1 Introduction


The present chapter contains the results of computer simulations,
which are presented to help validate the results of the prior
chapters and to indicate the small and medium sample properties of
the procedures developed there. Two basic types of simulations were
run. The first type constructs univariate confidence sets for the
variance ratio in an unbalanced model, validating the asymptotic
convergence of the jackknife. These simulations are similar to, and
may be thought of as an extension of, simulations given in Arvesen
(1969), Arvesen and Layard (1975), and Prasad and Rao (1990). The
second type constructs univariate confidence sets for the variance
ratio in a balanced model with two random components. The purpose of
this is to demonstrate the effectiveness of the HB estimator for
small samples.

5.2 The Simulation Results for the Unbalanced Model

The results of the simulation in the unbalanced model are
presented in this section. The purpose is to study the effectiveness
of the jackknifed confidence interval for the variance ratio in an
unbalanced model as the sample size increases. The hierarchical
Bayes estimator was not derived in the unbalanced case, so it is not
included in these calculations. Furthermore, the maximum likelihood
estimator must be found by iterative methods in the unbalanced case
(see, for example, Harville (1977)), so it also is not included in
this first set of simulations. Three estimators were included in this
first set of simulations. The Henderson III estimate was the basis of
all the calculations. First, a jackknifed version of this estimate
was calculated and then, using the jackknife estimate of variance, a
95% confidence set for the ratio was calculated. Secondly, it has
been conjectured by some that, for variance ratios, the log of the
estimate is a better function to jackknife (see, for example, Prasad
and Rao (1990)), so a jackknife of the log of the ratio was used to
construct 95% confidence intervals. Finally, using the jackknife
estimate of variance as an estimate of the variance of the Henderson
III estimator, 95% confidence intervals based on the unjackknifed
Henderson III estimator were produced.
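
The three intervals just described can be sketched as follows. This is
only an illustration under several assumptions that are not taken from
the text: the covariates are zero, the ratio is estimated by the usual
one-way ANOVA-type estimator (used here as a stand-in for Henderson
III), cells are deleted one at a time, the critical value is 1.96, and
the log interval is formed on the log scale and then exponentiated. At
least three cells are needed, and all names are hypothetical.

```python
import math


def anova_variance_ratio(cells):
    """ANOVA-type estimate of sigma_v^2 / sigma_e^2 in a one-way layout.
    `cells` is a list of lists of observations (possibly unequal sizes)."""
    k = len(cells)
    sizes = [len(c) for c in cells]
    total = sum(sizes)
    grand_mean = sum(sum(c) for c in cells) / total
    means = [sum(c) / len(c) for c in cells]
    ms_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means)) / (k - 1)
    ms_within = sum((y - m) ** 2 for c, m in zip(cells, means) for y in c) / (total - k)
    n0 = (total - sum(n * n for n in sizes) / total) / (k - 1)
    return (ms_between / ms_within - 1.0) / n0


def three_intervals(cells, estimator=anova_variance_ratio, z=1.96):
    """Intervals labelled H3j, Logj and H3, built from delete-one-cell estimates."""
    k = len(cells)
    full = estimator(cells)
    deleted = [estimator(cells[:i] + cells[i + 1:]) for i in range(k)]

    def interval(full_value, deleted_values, center_on_jackknife):
        pseudo = [k * full_value - (k - 1) * v for v in deleted_values]
        jack = sum(pseudo) / k
        variance = sum((p - jack) ** 2 for p in pseudo) / (k - 1)
        half = z * math.sqrt(variance / k)
        center = jack if center_on_jackknife else full_value
        return center - half, center + half

    result = {
        "H3j": interval(full, deleted, True),   # centered at the jackknifed estimate
        "H3": interval(full, deleted, False),   # centered at the unjackknifed estimate
        "Logj": None,                           # undefined if any estimate is <= 0
    }
    if full > 0 and all(v > 0 for v in deleted):
        lo, hi = interval(math.log(full), [math.log(v) for v in deleted], True)
        result["Logj"] = (math.exp(lo), math.exp(hi))
    return result
```

The H3j and H3 intervals share the same width and differ only in their
center, which mirrors the description of the third interval as using
the jackknife variance with the unjackknifed point estimate.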

These simulations were done for three distributions, namely
uniform, normal, and double exponential. These three distributions
represent light, medium and heavy tailed distributions and have been
used by most past authors, such as Arvesen (1969), Arvesen and Layard
(1975), and Prasad and Rao (1990), who have done similar types of
simulations. The distribution of the two random components was taken
to be the same except for the multiple which produced the desired
variance ratio, i.e., the pairs of distributions used were (uniform,
uniform); (normal, normal); or (double exponential, double
exponential). It should be pointed out that no known work includes
simulations where the two distributions are mixed. No mixed
distributions are included here, but some preliminary work in this
area indicates that it is problematic for all types of estimators.

The parameters chosen were, again, similar to the ones chosen by
Arvesen (1969), Arvesen and Layard (1975), and Prasad and Rao (1990).
The present results, as well as those presented by the above, used
true variance ratio values of 1.0, 2.5, and 4.0. For each
distribution type and for each value of delta, the variance ratio,
the number of cells was taken to start at twelve and then increase by
ten to forty two. The number of observations in the cells was taken
to be an approximately equal number of threes, fours, and fives. Cell
sizes were generated to be three, then four, and then five
consecutively, until the number of cells was reached. This is the
same distribution of cell sizes used in the three aforementioned
papers.
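
The cell-size pattern can be written down directly. The helper below is
a minimal sketch with a hypothetical name; it cycles through three,
four, and five until the requested number of cells is reached.

```python
from itertools import cycle, islice


def cell_sizes(num_cells, pattern=(3, 4, 5)):
    """Cell sizes 3, 4, 5, 3, 4, 5, ... truncated to `num_cells` entries."""
    return list(islice(cycle(pattern), num_cells))
```

With twelve or forty two cells the three sizes occur equally often, so
the average cell size is exactly four; with twenty two or thirty two
cells the counts differ by at most one.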

Once the value of delta, the total number of cells, the cell
sizes, and the distributions were determined, random numbers were
generated using the standard linear congruential random number
generator. First, a random number was generated which was used as the
common component for the cell. Then, for each observation in the
cell, a random number was generated which represented the error term.
This value was added to the common component to produce the random
observation. The covariates were assumed to be zero to simplify
calculations. Each of these combinations was repeated 10,000 times
for Table 5.1.
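
The data-generation step can be sketched as below. This is an
illustration rather than the original program: it uses Python's
standard `random` module (a Mersenne Twister, not a linear congruential
generator), it fixes the error variance at one and scales the common
component by the square root of the ratio, and the function and
distribution names are hypothetical.

```python
import math
import random


def unit_variance_draw(rng, dist):
    """One draw with mean 0 and variance 1 from the named distribution."""
    if dist == "normal":
        return rng.gauss(0.0, 1.0)
    if dist == "uniform":
        half_width = math.sqrt(3.0)  # Uniform(-sqrt(3), sqrt(3)) has variance 1
        return rng.uniform(-half_width, half_width)
    if dist == "double_exponential":
        # A Laplace variable with scale b has variance 2 * b**2, so b = 1/sqrt(2).
        magnitude = rng.expovariate(math.sqrt(2.0))  # exponential with mean 1/sqrt(2)
        return magnitude if rng.random() < 0.5 else -magnitude
    raise ValueError("unknown distribution: " + dist)


def simulate_cells(sizes, ratio, dist, rng):
    """One simulated data set: a common component per cell plus one error per observation.
    The error variance is 1 and the common-component variance is `ratio`."""
    cells = []
    for n in sizes:
        common = math.sqrt(ratio) * unit_variance_draw(rng, dist)
        cells.append([common + unit_variance_draw(rng, dist) for _ in range(n)])
    return cells
```

Passing an explicit `random.Random(seed)` instance keeps each simulated
data set reproducible.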

In Table 5.1, the numbers in the body of the table are coverage
probabilities, i.e., the proportion of times that the generated
confidence interval covered the true value of the variance ratio.
The jackknifed estimator is denoted by H3j, the jackknife of the log
is denoted by Logj, and the unjackknifed estimator is denoted by H3.
From Table 5.1 we can immediately see that the behavior of the
jackknife of the log of the ratio when the true ratio is 1 makes it
totally unacceptable: in the normal case its coverage falls from
0.6717 with twelve cells to 0.0707 with forty two cells. However, in
a situation where the ratio is known to be large and only a few cells
can be sampled it may warrant further investigation.
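
A coverage proportion and its standard error can be computed from a
list of simulated intervals as sketched below. The binomial standard
error formula is an assumption about how the parenthesized values in
the tables were obtained, and the function name is hypothetical.

```python
import math


def coverage_summary(intervals, true_ratio):
    """Proportion of (lower, upper) intervals containing `true_ratio`,
    together with the binomial standard error sqrt(p * (1 - p) / R)."""
    replications = len(intervals)
    hits = sum(1 for lower, upper in intervals if lower <= true_ratio <= upper)
    p = hits / replications
    return p, math.sqrt(p * (1.0 - p) / replications)
```

For example, four intervals of which two cover the true value give a
proportion of 0.5 with standard error 0.25.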

The confidence set based on the unjackknifed Henderson III
estimator seemed to perform better for larger values of the ratio, but
in the case of a true ratio of 1 the interval did not perform well.
It is clear from the table that even for as many as forty cells this
confidence interval is not performing satisfactorily. The confidence
interval based on the jackknifed estimator shows clear improvement
over the unjackknifed Henderson III interval, with a fairly
consistent performance for all values of the ratio and clearly
increasing coverage probability as the number of cells increases.

The confidence interval based on the jackknifed estimator is an
improvement over the unjackknifed estimator, but from the table we see
that for small sample sizes there is still room for improvement.




5.3 The Simulation Results for the Balanced Model

In this section several estimators are compared for some small
numbers of cells in a balanced model with two random components.
Again the parameter of interest is the ratio of the variance
components. The purpose of this set of simulations is to show the
superiority of the HB estimator in producing confidence intervals for
this ratio. The hierarchical Bayes (HB) estimator is compared to the
Henderson III (H3) estimator, the maximum likelihood (ML) estimator,
and the restricted maximum likelihood (REML) estimator. Jackknifed
versions of all estimators are used. There has been some previous
work in simulation of some of these estimators. For the H3, ML and
REML see, for example, Corbeil and Searle (1976). For the jackknifed
version of H3 see Prasad and Rao (1990). For the HB the only known
relevant works are Datta and Ghosh (1989) and Datta (1990).

As before, the distributions used are the uniform, normal, and
double exponential. The number of observations per cell is
arbitrarily set at four, but note that this is the average cell size
from the previous set of simulations. Again, values of 1.0, 2.5, and
4.0 are used for the true value of the variance ratio. In the prior
distribution for the HB estimator the value of g0 was arbitrarily set
to 0. For each of these combinations of values, independent random
numbers were generated for each distribution using the random number
generator in Mathematica. First, a random number was generated which
was used as the common component for the cell. Then, for each
observation in the cell, a random number was generated which
represented the error term. This value was added to the common
component to produce the random observation. The covariates were
assumed to be zero to simplify calculations. Each of these
combinations was repeated 1,000 times for Table 5.2. Then the
integration facility of Mathematica was used to calculate the HB
estimator.

Referring to Table 5.2, again the number in the body of the
table is the coverage probability of the confidence interval. We see
that the confidence interval based on the H3 estimator performs as
expected from the previous simulation. The REML estimator is almost
exactly the same, which is to be expected since in the balanced
situation REML is really the positive part of H3. As an aside, it
should be mentioned that the number of times that H3 produced a
negative estimate was recorded. For ratio values of 2.5 and 4.0
there was never more than one negative estimate per one thousand
trials in any distribution. For a ratio of one and eight cells there
were never more than twenty four negative estimates per one thousand
trials. The ML interval is seen to perform poorest of any confidence
interval. Referring to the review paper by Harville (1976), we see
that the ML estimator of variance can be seriously downward biased if
the number of degrees of freedom is sufficiently small, so this result
is not unexpected. The HB confidence interval, on the other hand,
performs very well for all distributions.
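
The relationship among the H3, REML, and ML point estimates in the
balanced case can be illustrated with the standard closed forms for a
balanced one-way random effects layout. The sketch assumes zero
covariates and a single unknown mean; those closed forms, and the
function name, are assumptions supplied for illustration rather than
formulas quoted from this dissertation.

```python
def balanced_ratio_estimates(cells):
    """Estimates of sigma_v^2 / sigma_e^2 from k cells of common size n.

    H3   : (MSB / MSW - 1) / n, which may be negative
    REML : positive part of H3
    ML   : positive part of ((k - 1) / k * MSB / MSW - 1) / n"""
    k = len(cells)
    n = len(cells[0])
    if any(len(c) != n for c in cells):
        raise ValueError("all cells must have the same size")
    means = [sum(c) / n for c in cells]
    grand_mean = sum(means) / k
    ms_between = n * sum((m - grand_mean) ** 2 for m in means) / (k - 1)
    ms_within = sum((y - m) ** 2 for c, m in zip(cells, means) for y in c) / (k * (n - 1))
    f_ratio = ms_between / ms_within
    h3 = (f_ratio - 1.0) / n
    reml = max(h3, 0.0)
    ml = max(((k - 1) / k * f_ratio - 1.0) / n, 0.0)
    return {"H3": h3, "REML": reml, "ML": ml}
```

Because the ML form shrinks the between-cell mean square by the factor
(k − 1)/k, its estimate never exceeds the REML estimate, and the gap is
largest when the number of cells is small.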




Table 5.1  Coverage probabilities
The nominal confidence coefficient is 0.95.
Standard errors are in parentheses.

                             number of cells
estimator   ratio       12         22         32         42

normal

H3j         1.00      0.8585     0.8952     0.9090     0.9175
                     (0.0034)   (0.0030)   (0.0028)   (0.0027)
            2.50      0.8635     0.8930     0.9040     0.9165
                     (0.0034)   (0.0030)   (0.0029)   (0.0027)
            4.00      0.8517     0.8925     0.9090     0.9230
                     (0.0035)   (0.0030)   (0.0028)   (0.0026)

Logj        1.00      0.6717     0.3745     0.1610     0.0707
                     (0.0046)   (0.0048)   (0.0036)   (0.0025)
            2.50      0.8770     0.7867     0.6847     0.5870
                     (0.0032)   (0.0040)   (0.0046)   (0.0049)
            4.00      0.9357     0.9160     0.9030     0.8895
                     (0.0024)   (0.0027)   (0.0029)   (0.0031)

H3          1.00      0.6892     0.6455     0.5685     0.4940
                     (0.0046)   (0.0047)   (0.0049)   (0.0049)
            2.50      0.8242     0.8212     0.8115     0.7935
                     (0.0038)   (0.0038)   (0.0039)   (0.0040)
            4.00      0.8637     0.8845     0.8832     0.8840
                     (0.0034)   (0.0031)   (0.0032)   (0.0032)

                             number of cells
estimator   ratio       12         22         32         42

[distribution label not legible]

H3j         1.00      0.8830     0.9115     0.9217     0.9272
                     (0.0032)   (0.0028)   (0.0026)   (0.0025)
            2.50      0.8937     0.9282     0.9382     0.9390
                     (0.0030)   (0.0025)   (0.0024)   (0.0023)
            4.00      0.9065     0.9287     0.9295     0.9312
                     (0.0029)   (0.0025)   (0.0025)   (0.0025)

Logj        1.00      0.6147     0.2260     0.0542     0.0067
                     (0.0048)   (0.0041)   (0.0022)   (0.0008)
            2.50      0.8840     0.7312     0.5670     0.4097
                     (0.0032)   (0.0044)   (0.0049)   (0.0049)
            4.00      0.9512     0.9412     0.8997     0.8667
                     (0.0021)   (0.0023)   (0.0030)   (0.0033)

H3          1.00      0.6977     0.5792     0.4817     0.5320
                     (0.0045)   (0.0049)   (0.0049)   (0.0049)
            2.50      0.8445     0.8285     0.7900     0.7320
                     (0.0036)   (0.0037)   (0.0040)   (0.0044)
            4.00      0.9031     0.9050     0.8837     0.8985
                     (0.0029)   (0.0029)   (0.0032)   (0.0030)

[distribution label not legible]

H3j         1.00      0.8870     0.9090     0.9245     0.9257
                     (0.0031)   (0.0028)   (0.0026)   (0.0026)
            2.50      0.8970     0.9232     0.9397     0.9345
                     (0.0030)   (0.0026)   (0.0023)   (0.0024)
            4.00      0.9082     0.9255     0.9405     0.9375
                     (0.0028)   (0.0026)   (0.0023)   (0.0024)

Logj        1.00      0.6002     0.2170     0.0070     0.0055
                     (0.0048)   (0.0041)   (0.0008)   (0.0007)
            2.50      0.8817     0.6982     0.3557     0.3580
                     (0.0032)   (0.0045)   (0.0047)   (0.0047)
            4.00      0.9595     0.9295     0.8587     0.8460
                     (0.0019)   (0.0025)   (0.0034)   (0.0036)

H3          1.00      0.6817     0.5705     0.3992     0.4077
                     (0.0046)   (0.0049)   (0.0048)   (0.0049)
            2.50      0.8420     0.8172     0.7105     0.7132
                     (0.0036)   (0.0038)   (0.0045)   (0.0045)
            4.00      0.9052     0.8977     0.8652     0.8922
                     (0.0029)   (0.0030)   (0.0034)   (0.0031)



Table 5.2  Coverage probabilities
The nominal confidence coefficient is 0.95.
The standard errors are in parentheses.

number                            estimator
of cells   ratio       H3        REML        ML         HB

normal

8          1.00      0.821      0.818      0.788      0.912
                    (0.012)    (0.012)    (0.012)    (0.008)
           2.50      0.852      0.852      0.817      0.935
                    (0.011)    (0.011)    (0.012)    (0.007)
           4.00      0.819      0.819      0.782      0.931
                    (0.012)    (0.012)    (0.013)    (0.008)

12         1.00      0.868      0.868      0.852      0.948
                    (0.010)    (0.011)    (0.011)    (0.007)
           2.50      0.854      0.854      0.838      0.954
                    (0.011)    (0.011)    (0.012)    (0.007)
           4.00      0.845      0.845      0.828      0.946
                    (0.011)    (0.011)    (0.012)    (0.007)

16         1.00      0.892      0.892      0.874      0.947
                    (0.010)    (0.010)    (0.010)    (0.007)
           2.50      0.898      0.898      0.884      0.962
                    (0.010)    (0.010)    (0.010)    (0.006)
           4.00      0.876      0.876      0.861      0.942
                    (0.010)    (0.010)    (0.010)    (0.007)



number                            estimator
of cells   ratio       H3        REML        ML         HB

double exponential

8          1.00      0.818      0.812      0.783      0.934
                    (0.012)    (0.012)    (0.013)    (0.008)
           2.50      0.830      0.830      0.794      0.930
                    (0.012)    (0.012)    (0.013)    (0.008)
           4.00      0.829      0.829      0.795      0.930
                    (0.012)    (0.012)    (0.013)    (0.008)

12         1.00      0.833      0.832      0.813      0.905
                    (0.012)    (0.012)    (0.012)    (0.009)
           2.50      0.844      0.844      0.820      0.922
                    (0.011)    (0.011)    (0.012)    (0.008)
           4.00      0.858      0.858      0.840      0.931
                    (0.011)    (0.011)    (0.012)    (0.008)

16         1.00      0.877      0.877      0.858      0.943
                    (0.010)    (0.010)    (0.011)    (0.007)
           2.50      0.857      0.857      0.853      0.940
                    (0.011)    (0.011)    (0.011)    (0.008)
           4.00      0.860      0.860      0.847      0.938
                    (0.011)    (0.011)    (0.011)    (0.008)



number                            estimator
of cells   ratio       H3        REML        ML         HB

uniform

8          1.00      0.847      0.844      0.817      0.930
                    (0.011)    (0.011)    (0.012)    (0.008)
           2.50      0.874      0.873      0.839      0.903
                    (0.010)    (0.011)    (0.012)    (0.009)
           4.00      0.873      0.873      0.837      0.925
                    (0.011)    (0.011)    (0.012)    (0.008)

12         1.00      0.895      0.894      0.866      0.935
                    (0.010)    (0.010)    (0.011)    (0.008)
           2.50      0.906      0.906      0.892      0.925
                    (0.009)    (0.009)    (0.010)    (0.008)
           4.00      0.915      0.915      0.897      0.928
                    (0.009)    (0.009)    (0.010)    (0.008)

16         1.00      0.898      0.898      0.883      0.954
                    (0.010)    (0.010)    (0.010)    (0.007)
           2.50      0.920      0.920      0.903      0.954
                    (0.009)    (0.009)    (0.009)    (0.007)
           4.00      0.921      0.921      0.903      0.924


CHAPTER  SIX 

SUMMARY  AND  FUTURE  RESEARCH 
6.1 Summary

In this dissertation a distribution free approach to the
construction of confidence sets is employed by using resampling
techniques in a general linear model including covariates.
Initially only models with two variance components are explored
since this gives rise to the common situation of estimating the
"heritability ratio". The asymptotic properties of the
jackknifed versions of the Henderson III estimators as well as
the jackknifed estimates of variance are studied. This is
extended to models with several random components but still
including covariates.

In  the  situation  of  a balanced  one-way  Anova  model  with 
covariates  a hierarchical  Bayes  estimator  of  the  variance  ratio 
is  derived.  The  asymptotic  distribution  of  this  estimator  is 
derived  under  a frequentist  model.  The  jackknife  estimate  of 
variance  for  this  estimator  is  shown  to  be  consistent  and  used 
to  construct  confidence  sets  for  the  variance  ratio. 

Finally, computer simulations of some of these models are
used to illustrate the features described in the preceding
chapters.




6.2  Future  Research 


We have only dealt with nested models in the general
linear model. An attempt to deal with more general models,
such as cross classificatory models, presents a useful problem
of wide scope. The models under consideration are applicable
only when a single characteristic is measured for each sampled
unit. It is often the case that sampled units have more than
one characteristic measured and these measurements are
correlated within a unit. Further work is needed to give
multivariate extensions to the methodologies of this study.

The  HB  estimator  is  an  initial  step  in  overcoming  the 
deficiencies  in  the  usual  estimators  but  it  is  studied  only  in 
a very  special  case.  The  first  extension  needs  to  be  made  to 
the  unbalanced  case.  Almost  no  situation  encountered  in 
practice  is  balanced  and  a study  of  the  properties  of  this 
estimator  in  the  unbalanced  case  is  necessary.  Another 
direction  for  future  work  is  to  include  models  with  more 
variance  components.  Finally,  the  form  of  the  variance 
structure  of  the  random  components  should  be  generalized  to 
include cases other than when the errors are independent, such
as when they are autocorrelated.




BIBLIOGRAPHY 


Algina, J. and Crocker, L. (1986). Introduction to Classical and
Modern Test Theory. Holt, Rinehart, and Winston, New York.

Arvesen, J.N. (1969). Jackknifing U-statistics. Ann. Math. Statist.,
40, 2076-2100.

Arvesen, J.N. and Layard, M.W.J. (1975). Asymptotically robust tests
in unbalanced variance component models. Ann. Statist., 3,
1122-1134.

Battese, G.E., Harter, R.M., and Fuller, W.A. (1988). An error
components model for prediction of county crop areas using
survey and satellite data. J. Amer. Statist. Assoc., 83, 28-36.

Cochran, W.G. (1939). The use of analysis of variance in
enumeration by sampling. J. Amer. Statist. Assoc., 34,
492-510.

Corbeil, R.R. and Searle, S.R. (1976). A comparison of variance
component estimators. Biometrics, 32, 779-791.

Crump, S.L. (1951). The present status of variance component
analysis. Biometrics, 7, 1-16.

Datta, G.S. (1990). Bayesian prediction in mixed linear models with
applications in small area estimation. Ph.D. dissertation,
University of Florida.

Datta, G.S. and Ghosh, M. (1988). Asymptotic optimality of
hierarchical Bayes estimators and predictors. Tech. Report No.
349, Dept. of Statistics, University of Florida.

Dempster, A.P., Rubin, D.B., and Tsutakawa, R.K. (1981). Estimation
in covariance components model. J. Amer. Statist. Assoc., 76,
341-353.

Efron, B. (1982). The Jackknife, the Bootstrap and Other Resampling
Plans. Society for Industrial and Applied Mathematics,
Philadelphia.

Fay, R.E. and Herriot, R. (1979). Estimates of income for small
places: an application of James-Stein procedures to census
data. J. Amer. Statist. Assoc., 74, 269-277.




Fuller, W.A. and Harter, R.M. (1987). The multivariate components of
a model for small area estimation. Small Area Statistics: An
International Symposium. Eds.: R. Platek, J.N.K. Rao, C.E.
Särndal, and M.P. Singh. Wiley, New York, pp. 103-123.

Ghosh, M. (1989). Hierarchical and empirical Bayes multivariate
estimation. Tech. Report No. 330, Dept. of Statistics,
University of Florida.

Ghosh, M. and Lahiri, P. (1987a). Robust empirical Bayes estimation
of means from stratified samples. J. Amer. Statist. Assoc., 82,
1153-1162.

Ghosh, M. and Lahiri, P. (1987b). Robust empirical Bayes estimation
of means from stratified samples. Sankhyā B, 78-89.

Ghosh, M. and Lahiri, P. (1988). Bayes and empirical Bayes analysis
in multistage sampling. Statistical Decision Theory and Related
Topics IV, 1. Eds.: S.S. Gupta and J.O. Berger.
Springer-Verlag, New York, pp. 195-212.

Ghosh, M. and Meeden, G. (1986). Empirical Bayes estimation in
finite population sampling. J. Amer. Statist. Assoc., 81,
1058-1062.

Ghosh, M. and Rao, J.N.K. (1991). Small area estimation.
Under preparation.

Gianola, D. and Fernando, R.L. (1986). Bayesian methods in animal
breeding theory. J. Anim. Sci., 63, 217-244.

Good, I.J. (1965). The Estimation of Probabilities. M.I.T. Press,
Cambridge, Massachusetts.

Hartigan, J.A. (1969). Linear Bayes methods. J. Roy. Statist. Soc., B,
31, 446-454.

Harville, D.A. (1976). Extension of Gauss-Markov theorem to include
the estimation of random effects. Ann. Statist., 4,
384-395.

Harville, D.A. (1977). Maximum likelihood approaches to variance
component estimation and to related problems. J. Amer.
Statist. Assoc., 72, 320-340.

Harville, D.A. (1985). Decomposition of prediction error. J. Amer.
Statist. Assoc., 80, 132-138.

Harville, D.A. (1988). Mixed-model methodology: theoretical
justifications and future directions. Proceedings of the
Statistical Computing Section of the ASA, 41-49.




Harville,  D.A.  (1990)  The  multivariate  components  of  a model  for 
small  area  estimation.  Advances  in  Statistical  Methods  for 
Genetic  Improvement  of  Livestock.  Eds.:  D.  Gianola  and  K. 

Hammond  Springer-Verlag,  New  York,  pp  103-123. 

Henderson,  C.R.  (1950).  Estimates  of  genetic  parameters.  Ann.  Math. 
Statist.  21 . 309. 

Henderson,  C.R.  (1953).  Estimation  of  variance  and  covariance 
covariance  components.  Biometrics . 9,  226-252. 

Henderson,  C.R.  (1963).  Selection  index  and  expected  genetic 

advance.  Statistical  Genetics  and  Plant  Breeding.  Eds.:  W.D. 
Hanson  and  H.F.  Robinson.  National  Academy  of  Sciences  National 
Research  Council,  Washington.  Publication  982,  141-163. 

Henderson,  C.R.  (1975).  Best  linear  unbiased  estimation  and 

prediction  under  a selection  model.  Biometrics . 31 . 423-447. 

Henderson,  C.R. , Kempthorne,  0.,  Searle,  S.R.,  and  von  Krosigh,  C.M. 
(1959).  The  estimation  of  environmental  and  genetic  trends  from 
records  subject  to  culling.  Biometrics . 15.  192. 

Henderson,  H.V.  and  Searle,  S.R.  (1981).  On  deriving  the  inverse  of  a 
sum  of  matrices.  SIAM  Review.  23,  53-60. 

Kackar,  R.N.  and  Harville,  D.A.  (1984).  Approximations  for  standard 
errors  of  estimators  of  fixed  and  random  effects  in  mixed  linear 
models.  J.  Amer.  Statist.  Assoc.,  79,  853-862. 

Lahiri,  P.  and  Rao,  J.N.K.  (1990).  Robust  estimation  of  mean  square 
error  of  empirical  best  linear  unbiased  predictors.  Preprint. 

Lindley,  D.V.  and  Smith,  A.F.M.  (1972).  Bayes  estimates  for  linear 
model  (with  discussion).  J.  Roy.  Statist.  Soc.,  B,  34,  1-41. 

Loeve,  M.  (1963)  Probability  Theory,  3rd  Edition.  Van  Nostrand, 
Princeton . 

Lush,  J.L.  and  Shrode,  R.R.  (1950) . Changes  in  milk  production  with 
age  and  milking  frequency.  J ■ Dairy  Sci . , 33,  338. 

Miller,  R.G.  (1974).  An  unbalanced  jackknife.  Ann.  Statist.,  2, 
880-891. 

Olsen, A., Seely, J., and Birkes, D. (1976). Invariant quadratic
unbiased estimation for two variance components. Ann. Statist.,
4, 878-890.

Penrose, R.A. (1955). A generalized inverse for matrices. Proc.
Cambridge Philos. Soc., 54, 406-413.


Portnoy,  S.  (1971).  Formal  Bayes  estimation  with  application  to  a 
random  effects  model.  Ann.  Math.  Statist.,  42,  1379-1402. 

Prasad, N.G.N. and Rao, J.N.K. (1990). The estimation of mean
squared error of small area predictors. J. Amer. Statist.
Assoc., 85, 163-171.

Pukelsheim, F. (1977). Estimating variance components in linear
models. J. Mult. Anal., 6, 626-629.

Pukelsheim, F. (1981). On the existence of unbiased nonnegative
estimates of variance covariance components. Ann. Statist., 9,
293-299.

Quenouille, M.H. (1949). Approximate tests of correlation in time
series. J. Roy. Statist. Soc., B, 11, 68-84.

Quenouille, M.H. (1956). Notes on bias in estimation. Biometrika, 43,
353-360.

Rao,  C.R.  and  Kleffe,  J.  (1988).  Estimation  of  Variance  Components. 
North-Holland,  Amsterdam. 

Shavelson, J.R. and Webb, C. (1981). Generalizability theory:
1973-1980. Brit. J. Math. Statist. Psych., 34, 133-166.

Smith,  D.  W.  and  Murray,  L.  W.  (1984).  An  alternative  to  Eisenhart’s 
model  II  and  mixed  model  in  the  case  of  negative  variance 
estimates.  J.  Amer.  Statist.  Assoc.,  79,  145-151. 

Snedecor,  G.W.  and  Cochran,  W.G.  (1967).  Statistical  Methods,  6th 
Edition, Iowa State University Press, Ames, Iowa.

Tukey, J.W. (1958). Bias and confidence in not-quite large samples.
Ann.  Math.  Statist.,  29,  43-56. 

Westfall,  P.H.  (1986).  Asymptotic  normality  of  the  Anova  estimates  of 
components  of  variance  in  the  nonnormal,  unbalanced  hierarchical 
mixed model. Ann. Statist., 14, 1572-1582.

Whittle, P. (1960). Bounds for moments of linear and quadratic forms
in independent random variables. Theory Probab. Appl., 5,
302-305.

Yates, F. and Zacopanay, I. (1935). The estimation of the efficiency
of  sampling,  with  special  reference  to  sampling  for  yield  in 
cereal  experiments.  J.  Agricultural  Science,  25,  545-577. 


BIOGRAPHICAL  SKETCH 


Robert M. Baskin was born on December 1, 1949, in Little Rock,
Arkansas. After graduating from the University of Arkansas at Little
Rock in 1976 with a Bachelor of Arts in mathematics, he joined the
Department  of  Mathematics  at  Oklahoma  State  University  where  he 
received  a Master  of  Science  degree  in  mathematics  in  1979.  He 
expects  to  receive  a Doctor  of  Philosophy  in  statistics  from  the 
University  of  Florida  in  August  1991. 



I certify  that  I have  read  this  study  and  that  in  my  opinion  it 
conforms  to  acceptable  standards  of  scholarly  presentation  and  is  fully 
adequate,  in  scope  and  quality,  as  a dissertation  for  the  degree  of 
Doctor  of  Philosophy. 


Malay Ghosh, Chairman
Professor  of  Statistics 


I certify  that  I have  read  this  study  and  that  in  my  opinion  it 
conforms  to  acceptable  standards  of  scholarly  presentation  and  is  fully 
adequate,  in  scope  and  quality,  as  a dissertation  for  the  degree  of 
Doctor  of  Philosophy. 

Ronald  H.  Randles 
Professor  of  Statistics 


I certify that I have read this study and that in my opinion it
conforms  to  acceptable  standards  of  scholarly  presentation  and  is  fully 
adequate,  in  scope  and  quality,  as  a dissertation  for  the  degree  of 
Doctor  of  Philosophy. 

Pejaver  V.  Rao 
Professor  of  Statistics 


I certify that I have read this study and that in my opinion it
conforms to acceptable standards of scholarly presentation and is fully
adequate, in scope and quality, as a dissertation for the degree of
Doctor of Philosophy.


Professor  of  Statistics 


I certify that I have read this study and that in my opinion it
conforms  to  acceptable  standards  of  scholarly  presentation  and  is  fully 
adequate,  in  scope  and  quality,  as  a dissertation  for  the  degree  of 
Doctor  of  Philosophy. 



Joseph  Glover 
Professor of Mathematics


This  dissertation  was  submitted  to  the  Graduate  Faculty  of  the 
Department  of  Statistics  in  the  College  of  Liberal  Arts  and  Sciences 
and  to  the  Graduate  School  and  was  accepted  as  partial  fulfillment  of 
the  requirements  for  the  degree  Doctor  of  Philosophy. 

August,  1991 


Dean,  Graduate  School 


